* [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions
@ 2024-06-13 8:22 Thorsten Leemhuis
2024-06-13 8:26 ` [MAINTAINERS SUMMIT] [1/4] Create written down guidelines for handling regressions Thorsten Leemhuis
` (4 more replies)
0 siblings, 5 replies; 107+ messages in thread
From: Thorsten Leemhuis @ 2024-06-13 8:22 UTC (permalink / raw)
To: ksummit
Lo! I prepared four proposals for the maintainers summit regarding
regressions I'll send in reply to this mail. They are somewhat related
and address different aspects of one scenario I see frequently in
different variations; so instead of repeating that scenario in slightly
modified form in each of the proposals, I'm putting it out here once:
---
A change makes it into mainline (case "a") right before a release (say
of 6.7) or (case "b") shortly after (e.g. during a merge window of 6.8).
That change then within a few days might even be backported to the
latest stable releases (case "c") deemed for end users (say 6.6.13 or
[in case "b"] to 6.7.3). Only then it becomes known that the change
causes a regression (e.g. in both mainline and the stable trees).
Once reported, the problem then is quickly debugged and within two or
three days a tested fix ready for review emerges[1]. But then that fix
only makes it to a new mainline -rc after more than two, three, or many
more weeks due to one or a combination of the following factors:
* Review takes a long time, as nobody feels urged to take a closer look
soon[2].
* Maintainers take quite some time to commit the fix to their subsystem
tree[2].
* Maintainers take quite some time to submit another PR to Linus or only
send the fix shortly before[2, 3] or after[4] the next proper mainline
release (e.g. 6.8).
A few days later the stable team then backports the fix (for case "a"
and "c" -- or "b", if the fix was only merged in the following merge
window) and after a stable-rc phase fixes the problem in their trees as
well[5] -- which takes at least three days, usually close to about one
week, and if the timing is bad easily 10 days or longer.
Despite an available fix, users of mainline then in the end were exposed
to the regression for many weeks[6] -- often more than 1 month, but I've
seen 2+ months quite a few times, too (e.g. when the culprit was merged
shortly before the 6.7 release and the fix only in the merge window for
6.9).
For users of stable trees it is often about as long or a little bit
longer depending on how well the mainline merge of the fix aligns with
the release of the next stable-rc[7]; that's because the stable team is
not allowed[8] or usually won't[9] do anything to resolve regressions
that also happen in mainline before a fix is mainlined.
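
To make the durations above concrete, here is a small Python sketch that
adds up the stages of the scenario. The class and field names are made up
for illustration, and the sample values are only picked from the ranges
mentioned above (fix ready within three days, about three weeks until it
reaches a mainline -rc, about one week for the stable release); they are
not measurements. Exposure is counted from the report, so the time the
regression went unnoticed comes on top.

```python
from dataclasses import dataclass


@dataclass
class Delays:
    """Days spent in each stage; all values are illustrative inputs."""

    until_fix_ready: int       # report -> tested fix ready for review
    until_mainlined: int       # fix ready -> merged into a mainline -rc
    until_stable_release: int  # mainline merge -> fixed stable release

    def mainline_exposure(self) -> int:
        """Days mainline users wait for the fix after the report."""
        return self.until_fix_ready + self.until_mainlined

    def stable_exposure(self) -> int:
        """Days stable users wait: mainline first, then the backport."""
        return self.mainline_exposure() + self.until_stable_release


# Hypothetical sample taken from the ranges in the scenario.
typical = Delays(until_fix_ready=3, until_mainlined=21, until_stable_release=7)
print(typical.mainline_exposure())  # 24
print(typical.stable_exposure())    # 31
```

With these inputs the fix itself accounts for only 3 of the 31 days; the
rest is spent waiting for merges and releases.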
[1] That's obviously not always the case, but surprisingly often, which
is great; thx for that!
[2] Because they are simply not aware that the patch fixes a regression
that bothers users or due to stances like "the next mainline release
is still weeks away".
[3] For example due to stances like "because I did not want to send
Linus a PR with just one fix"; I recently even had a case of "the
-next rules forbids to commit new changes during the merge window"
(which is not true when it comes to fixes) that delayed things.
[4] Disclaimer: for fixes that bear big risks or fixes for regressions
introduced more than a year ago waiting can be the right thing.
[5] Assuming the stable team notices that it's fixing a regression in
their trees.
[6] Which can discourage or hinder testers and CIs from testing and mean
that they miss other bugs affecting the same platform. IOW: even in
a simplified case of this scenario where the fix was not backported
to stable trees this would be a problem for some folks that could
easily be avoided by merging the fix faster.
[7] Until the issue is fixed, users thus have four options: (1) live
with the regression, which might be impossible if it breaks
something crucial like booting; (2) switch back to the
second-to-last stable series, which often will be EOL and thus prone
to vulnerabilities; (3) switch to an older longterm kernel series,
which might be impossible due to missing drivers or because the
culprit was backported there, too; (4) switch to an unstable kernel
(e.g. mainline) once the issue is fixed there, which they might be
afraid to do or if unlucky contains another regression that causes
trouble.
[8] Our rules forbid the stable maintainer to revert a mainline commit
or accept a fix for it before an equivalent revert or fix is
mainlined.
[9] The stable team could temporarily revert a backport of a mainline
commit that is causing a regression in both mainline and stable
which they later could reapply once a fix for it was mainlined --
but it almost never does that. Which I can understand, as that would
complicate things and might be unwise, as the commit might fix some
security issue; this approach also might be the right strategy in
general to ensure mainline is fixed quickly as well[6].
---
Ciao, Thorsten
^ permalink raw reply [flat|nested] 107+ messages in thread

* [MAINTAINERS SUMMIT] [1/4] Create written down guidelines for handling regressions
2024-06-13 8:22 [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions Thorsten Leemhuis
@ 2024-06-13 8:26 ` Thorsten Leemhuis
2024-09-12 13:33 ` Thorsten Leemhuis
2024-06-13 8:32 ` [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series Thorsten Leemhuis
` (3 subsequent siblings)
4 siblings, 1 reply; 107+ messages in thread
From: Thorsten Leemhuis @ 2024-06-13 8:26 UTC (permalink / raw)
To: ksummit

Different assumptions about the appropriate handling of regressions
frequently lead to friction and time consuming discussions during my
regression tracking and prodding work. That is frustrating, demotivating
and exhausting for everyone involved and even brought us to situations
like "then I'm stepping down as maintainer". To avoid things like this,
I propose we try to pin down guidelines together and ideally make Linus
bless them.

The "Expectations and best practices for fixing regressions" in
Documentation/process/handling-regressions.rst (
https://docs.kernel.org/process/handling-regressions.html#expectations-and-best-practices-for-fixing-regressions
) could be a start for such guidelines -- but I'm obviously biased here,
as I wrote that text, so feel free to propose something new.

That text is based on generalized interpretations of statements and
actions from Linus while keeping practical application and our workflows
in mind -- including the maintenance of stable trees. I have no idea if
I went too far somewhere: the submission of that text was addressed to
Linus, but he did not react; otoh he merged it later after Greg ACKed it
and it came to his doorstep through the docs tree.

But in the end it seems most people do not know about this text or do
not take it for real.
That's why I'd like to make this more official or create something new
that is blessed and widely accepted, despite all the downsides that
"writing things down" sometimes has. That means that the text likely
should be moved somewhere else closer to
Documentation/process/submitting-patches.rst and/or to
Documentation/process/6.Followthrough.rst maybe.

That "Expectations and best practices for fixing regressions" has a
short section that basically says everything crucial in a generic way
already:

> As a Linux kernel developer, you are expected to give your best to prevent
> situations where a regression caused by a recent change of yours leaves users
> only these options:
>
> * Run a kernel with a regression that impacts usage.
>
> * Switch to an older or newer kernel series.
>
> * Continue running an outdated and thus potentially insecure kernel for more
>   than three weeks after the regression's culprit was identified. Ideally it
>   should be less than two. And it ought to be just a few days, if the issue is
>   severe or affects many users -- either in general or in prevalent
>   environments.

This thus should already prevent all variations of the example scenario
the mail at the start of this thread covered. So maybe this is enough
already.

Fun fact: in an earlier version of that text (which was in mainline for
about a year and also ACKed by Greg) it was more like "within two weeks,
ideally one"; but that afaics turned out to be too demanding, especially
for subsystem maintainers, as then they would definitely have to send
PRs more often to reach that target.

This and other practical aspects are also the reason why the text
continues with more detailed instructions:

> How to realize that in practice depends on various factors. Use the following
> rules of thumb as a guide.

Let me comment on some of the other points in case we want to use them
as a base for guidelines.
Again, many of the underlying problems that lead to the following points
can be seen in the scenario at the start of the thread.

> In general:
>
> * Prioritize work on regressions over all other Linux kernel work, unless the
>   latter concerns a severe issue (e.g. acute security vulnerability, data loss,
>   bricked hardware, ...).
>
> * Expedite fixing mainline regressions that recently made it into a proper
>   mainline, stable, or longterm release (either directly or via backport).

Note: The "stable, or longterm release (either directly or via
backport)" in this point is just from my interpretation, not sure what
Linus thinks about it. Proposal 3/4 will focus on that, so maybe ignore
this part here.

Another note: That "recently" in the second point becomes more concrete
in later points (see quotes below; yes, this is not really ideal and
maybe should be fixed) and is based on
https://lore.kernel.org/all/CAHk-=wis_qQy4oDNynNKi5b7Qhosmxtoj1jxo5wmB6SRUwQUBQ@mail.gmail.com/

> * Do not consider regressions from the current cycle as something that can wait
>   till the end of the cycle, as the issue might discourage or prevent users and
>   CI systems from testing mainline now or generally.

Not sure if this and the two points before it are really what Linus
wants, but from his actions it seemed to me it's something like that.

> * Work with the required care to avoid additional or bigger damage, even if
>   resolving an issue then might take longer than outlined below.

FWIW, this obviously provides a loophole that could be used in any
situation -- but at the same time I think it's wise to have it here for
cases where reverts are not an option and a proper fix takes time to get
right.

> On timing once the culprit of a regression is known:
>
> * Aim to mainline a fix within two or three days, if the issue is severe or
>   bothering many users -- either in general or in prevalent conditions like a
>   particular hardware environment, distribution, or stable/longterm series.

Note, ignore...

> * Aim to mainline a fix by Sunday after the next, if the culprit made it
>   into a recent mainline, stable, or longterm release (either directly or via
>   backport); if the culprit became known early during a week and is simple to
>   resolve, try to mainline the fix within the same week.

...this point here, as proposal 3/4 will cover that in more detail.

> * For other regressions, aim to mainline fixes before the hindmost Sunday
>   within the next three weeks. One or two Sundays later are acceptable, if the
>   regression is something people can live with easily for a while -- like a
>   mild performance regression.

FWIW, I chose this "Sunday after the next", "hindmost Sunday within the
next three weeks", … approach so that subsystem maintainers normally
have no extra work if they flush their fixes at the end of the week,
which quite a few do. And when it comes to regressions that imho is a
wise approach, as that ensures the fix makes it into the next -rc. It
often also aligns nicely with stable trees, as that way the fix one or
two days later might already be in the next stable-rc, as they are often
released on Mondays or Tuesdays (at least they usually were before Greg
put even more load on his already overburdened shoulders with the CVE
stuff a few months ago...).

> * It's strongly discouraged to delay mainlining regression fixes till the next
>   merge window, except when the fix is extraordinarily risky or when the
>   culprit was mainlined more than a year ago.

This used to be a big problem and already got somewhat better, but there
is still quite a bit of room for improvement from what I see.

> On procedure:
>
> * Always consider reverting the culprit, as it's often the quickest and least
>   dangerous way to fix a regression. Don't worry about mainlining a fixed
>   variant later: that should be straight-forward, as most of the code went
>   through review once already.

This was meant to encourage reverts, as some people see them as
something bad -- when in reality, in kernel context, Linus afaics wants
them to be seen as "this was not ready, no big deal, just revert and
reapply in a few weeks together with a fix". Wondering if we should do
something to get that across better.

> * Try to resolve any regressions introduced in mainline during the past
>   twelve months before the current development cycle ends: Linus wants such
>   regressions to be handled like those from the current cycle, unless fixing
>   bears unusual risks.
>
> * Consider CCing Linus on discussions or patch review, if a regression seems
>   tangly. Do the same in precarious or urgent cases -- especially if the
>   subsystem maintainer might be unavailable. Also CC the stable team, when you
>   know such a regression made it into a mainline, stable, or longterm release.
>
> * For urgent regressions, consider asking Linus to pick up the fix straight
>   from the mailing list: he is totally fine with that for uncontroversial
>   fixes. Ideally though such requests should happen in accordance with the
>   subsystem maintainers or come directly from them.
>
> * In case you are unsure if a fix is worth the risk applying just days before
>   a new mainline release, send Linus a mail with the usual lists and people in
>   CC; in it, summarize the situation while asking him to consider picking up
>   the fix straight from the list. He then himself can make the call and when
>   needed even postpone the release. Such requests again should ideally happen
>   in accordance with the subsystem maintainers or come directly from them.
>
> Regarding stable and longterm kernels:
>
> * You are free to leave regressions to the stable team, if they at no point in
>   time occurred with mainline or were fixed there already.

Ignore...

> * If a regression made it into a proper mainline release during the past
>   twelve months, ensure to tag the fix with "Cc: stable@vger.kernel.org", as a
>   "Fixes:" tag alone does not guarantee a backport. Please add the same tag,
>   in case you know the culprit was backported to stable or longterm kernels.

...this point here, as proposal 2/4 will cover it.

> * When receiving reports about regressions in recent stable or longterm kernel
>   series, please evaluate at least briefly if the issue might happen in current
>   mainline as well -- and if that seems likely, take hold of the report. If in
>   doubt, ask the reporter to check mainline.
>
> * Whenever you want to swiftly resolve a regression that recently also made it
>   into a proper mainline, stable, or longterm release, fix it quickly in
>   mainline; when appropriate thus involve Linus to fast-track the fix (see
>   above). That's because the stable team normally does neither revert nor fix
>   any changes that cause the same problems in mainline.
>
> * In case of urgent regression fixes you might want to ensure prompt
>   backporting by dropping the stable team a note once the fix was mainlined;
>   this is especially advisable during merge windows and shortly thereafter, as
>   the fix otherwise might land at the end of a huge patch queue.
>
> On patch flow:
>
> * Developers, when trying to reach the time periods mentioned above, remember
>   to account for the time it takes to get fixes tested, reviewed, and merged by
>   Linus, ideally with them being in linux-next at least briefly. Hence, if a
>   fix is urgent, make it obvious to ensure others handle it appropriately.
>
> * Reviewers, you are kindly asked to assist developers in reaching the time
>   periods mentioned above by reviewing regression fixes in a timely manner.
>
> * Subsystem maintainers,

FWIW, there is one problem related to this and the previous point that I
haven't written a proposal for, but maybe should have: reviewers and
subsystem maintainers have no dead simple and reliable way to detect
"ohh, this is a regression fix I maybe should prioritize". Some
agreed-on tag in the subject could help. [REGFIX]? [URGENT]?
[FASTTRACK]? Hmmm, do not really like any of them, except maybe the
last... :-/ A `Cc: regressions@lists.linux.dev` would help as well, but
it would be yet another tag and harder to spot. :-/

Yes, I mentioned the latter idea two years ago already without success.
But some people started doing it since then, which is nice, as it helps
me keep an eye on things or to become aware of regressions; it would
also allow me to easily spot regression fixes that are queued for the
next cycle that instead might better be merged in the current cycle.

>   you likewise are encouraged to expedite the handling
>   of regression fixes. Thus evaluate if skipping linux-next is an option for
>   the particular fix. Also consider sending git pull requests more often than
>   usual when needed. And try to avoid holding onto regression fixes over
>   weekends -- especially when the fix is marked for backporting.

If all subsystems would usually abide by this point, many regressions
would be fixed quite a bit faster from what I see, as "fixes are sitting
in subsystem trees for one or more weeks and are sometimes flushed
shortly after a new rc" is the aspect that from my point of view
currently is the one that is causing most delays (note: for a few
subsystems that is not a problem at all, they are good at this!).

* Re: [MAINTAINERS SUMMIT] [1/4] Create written down guidelines for handling regressions
2024-06-13 8:26 ` [MAINTAINERS SUMMIT] [1/4] Create written down guidelines for handling regressions Thorsten Leemhuis
@ 2024-09-12 13:33 ` Thorsten Leemhuis
0 siblings, 0 replies; 107+ messages in thread
From: Thorsten Leemhuis @ 2024-09-12 13:33 UTC (permalink / raw)
To: ksummit

On 13.06.24 10:26, Thorsten Leemhuis wrote:
> Different assumptions about the appropriate handling of regressions
> frequently lead to friction and time consuming discussions during my
> regression tracking and prodding work. That is frustrating, demotivating
> and exhausting for everyone involved and even brought us to situations
> like "then I'm stepping down as maintainer". To avoid things like this,
> I propose we try to pin down guidelines together and ideally make Linus
> bless them.
>
> The "Expectations and best practices for fixing regressions" in
> Documentation/process/handling-regressions.rst (
> https://docs.kernel.org/process/handling-regressions.html#expectations-and-best-practices-for-fixing-regressions
> ) could be a start for such guidelines -- but I'm obviously biased here,
> as I wrote that text, so feel free to propose something new.
>
> That text is based on generalized interpretations of statements and
> actions from Linus while keeping practical application and our workflows
> in mind -- including the maintenance of stable trees. I have no idea if
> I went too far somewhere: the submission of that text was addressed to
> Linus, but he did not react; otoh he merged it later after Greg ACKed it
> and it came to his doorstep through the docs tree.
>
> But in the end it seems most people do not know about this text or do
> not take it for real.
[...]

Lo! The discussion here rightfully exposed that the wording regarding
the stable tag was way too strong. Sorry for that, not sure how that
happened, that was not my intent.
That and a few other aspects (some from the discussions here) made me
revisit the text regarding "Expectations and best practices for fixing
regressions". See below for my current draft (the diff view is not
really helpful, sorry). Note, should be easy to add a week or two to any
sections regarding the timing aspects; guess that is best discussed at
the summit.

As I said earlier, the text is based on generalized interpretations of
statements and actions from Linus with some interpolation. But in some
areas what I wrote might be not what Linus wants. To sort this out I'm
currently also preparing a few scenarios with related questions for the
maintainers summit audience (incl. Linus) that hopefully will help to
keep the discussion fruitful, targeted, and as short as possible. More
on that on Tuesday.

Ciao, Thorsten
---

Expectations and best practices for fixing regressions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Try to quickly resolve regressions in mainline while applying reasonable care
to prevent additional problems. The appropriate balance depends on the
situation; most regressions should ideally be resolved through a fix or a
revert by the last Sunday within the next two or three weeks after the culprit
was identified. The rules of thumb below outline the appropriate procedure in
more detail.

The overall goal is to prevent situations where a regression caused by a recent
change leaves users only three bad options: use a kernel with a regression that
impacts usage, switch to a different kernel series, or run an outdated and thus
potentially insecure kernel for more than two or three weeks.

In general:

* Prioritize work on providing, reviewing, and mainlining regression fixes over
  other upstream Linux kernel work, unless the latter concerns severe issues
  (e.g. acute security vulnerabilities, data loss, or bricked hardware).

* Do not consider fixing regressions from the current development cycle as
  something that can wait till its end: the issue possibly prevents users or
  CI systems from testing, which might drive testers away and mask other bugs.

* When developing a fix, apply the required care to avoid additional damage.
  Do so even when resolving a regression might take longer than outlined below
  -- at least unless a revert could resolve it, as then you should opt for one.

* Reviewers and maintainers likewise should apply the required care, but at
  the same time should try to route regression fixes quickly through the ranks.

On timing once the change causing the regression became known:

* If the regression is severe, aim to mainline a fix within two or three work
  days and ideally before the next Sunday; do the same if it is bothering many
  users in general or most people in prevalent environments (say a widespread
  hardware device, a popular Linux distribution, or a stable/longterm series).

* Aim to mainline a fix by Sunday after the next, if the culprit made it into a
  kernel deemed for end users during the past three months -- either directly
  through a mainline release or through backports to stable or longterm series.
  If the culprit became known early during a week while being simple to resolve
  using a low-risk patch, try to mainline the fix within the same week instead.

* For other regressions introduced during the past twelve months, aim to
  mainline a fix before the hindmost Sunday within the next three weeks. One or
  two weeks later are acceptable, if the regression is unlikely to bother more
  than a user or two or is something people can easily live with temporarily.

* Try your best to mainline a fix before the current development cycle ends,
  unless the culprit was committed more than a year ago: then it is acceptable
  to queue a fix for the next merge window, which definitely should be done in
  case it bears bigger risks.

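
The Sunday-based targets in the timing list above can be worked out
mechanically. A minimal Python sketch, assuming that "the next Sunday"
means the first Sunday strictly after the day the culprit became known
and that "hindmost Sunday within the next three weeks" means the last
Sunday no later than 21 days after that day. Both readings and all
function names are assumptions for illustration, not something the draft
spells out.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday(): Monday is 0, Sunday is 6


def next_sunday(day: date) -> date:
    """First Sunday strictly after `day` (assumed reading of "next Sunday")."""
    days_ahead = (SUNDAY - day.weekday()) % 7
    return day + timedelta(days=days_ahead or 7)


def sunday_after_next(day: date) -> date:
    """The Sunday one week after the next one."""
    return next_sunday(day) + timedelta(weeks=1)


def hindmost_sunday_within(day: date, weeks: int = 3) -> date:
    """Last Sunday that is no later than `weeks` weeks after `day`."""
    limit = day + timedelta(weeks=weeks)
    return limit - timedelta(days=(limit.weekday() - SUNDAY) % 7)


# Example: culprit identified on Thursday, 2024-06-13 (an arbitrary date).
known = date(2024, 6, 13)
print(next_sunday(known))             # 2024-06-16
print(sunday_after_next(known))       # 2024-06-23
print(hindmost_sunday_within(known))  # 2024-06-30
```

Anchoring every target on a Sunday means a fix flushed at the end of the
week lands in the next -rc without maintainers needing an extra pull
request.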
On patch flow to mainline:

* Developers, when trying to reach the time periods mentioned above, remember
  to account for the time it will take to test, review, commit, and mainline
  fixes, ideally with them being in linux-next at least briefly. Hence, if
  fixes are urgent, make it obvious to ensure others handle them appropriately.

* Reviewers, you are kindly asked to assist developers in reaching the time
  periods mentioned above by reviewing regression fixes in a timely manner.

* Maintainers, you likewise are kindly asked to expedite the handling of
  regression fixes. Thus when beneficial evaluate if skipping linux-next might
  be an option. Also consider sending git pull requests more often than usual
  when appropriate. And try to avoid holding onto regression fixes over
  weekends -- especially when some are marked for backporting to stable series.

On procedure:

* If a regression seems tangly, precarious or urgent, consider CCing Linus on
  discussions or patch review; do the same if the responsible maintainers are
  suspected to be unavailable.

* For an urgent regression, consider asking Linus to pick up a fix straight
  from the mailing list: he is totally fine with that for uncontroversial
  fixes. Such requests should ideally come directly from maintainers or happen
  in accordance with them.

* In case you are unsure if a fix is worth the risk applying just days before
  a new mainline release, send Linus a mail with the usual lists and developers
  in CC; in it, summarize the situation while asking to pick up the fix
  straight from the list. Linus then can make the call and when appropriate
  even postpone the release. Such requests again should ideally come directly
  from maintainers or happen in accordance with them.

On tagging in the patch description:

* Include the tags Documentation/process/submitting-patches.rst mentions for
  regressions; this usually means a "Reported-by:" tag followed by a "Link:" or
  "Closes:" tag pointing to the report as well as a "Fixes:" tag; if it's a
  regression a later change exposed, add a "Fixes:" tag for that one, too.

* Did the culprit make it into a proper mainline release during the past
  twelve months? Or is it a recent mainline commit backported to stable or
  longterm releases in the past few weeks? Then you are kindly asked to ensure
  stable inclusion as described by
  Documentation/process/stable-kernel-rules.rst, e.g. by adding a
  "Cc: stable@vger.kernel.org" to the patch description. Note, a "Fixes:" tag
  alone does not guarantee a backport: the stable team sometimes silently
  drops such changes, for example when they do not apply cleanly.

Regarding stable and longterm kernels:

* When receiving reports about regressions in recent stable or longterm kernel
  series, please consider evaluating at least briefly, if the issue might
  happen in current mainline as well -- and if that seems likely, take hold of
  the report. If in doubt, ask the reporter to check mainline.

* You are free to leave handling regressions to the stable team, if the problem
  at no point in time occurred with mainline or was fixed there already.

* Whenever you want to swiftly resolve a mainline regression that recently
  made it into a mainline, stable, or longterm release, fix it quickly in
  mainline; in urgent cases thus involve Linus to fast-track fixes (see above).
  That's required, as the stable team normally does neither revert nor fix any
  changes in their trees as long as those cause the same problem in mainline.

* In case of urgent fixes for regressions affecting stable or longterm
  kernels, you might want to ensure prompt backporting by dropping the stable
  team a note once the fix was mainlined; this is especially advisable during
  merge windows and shortly thereafter, as the fix otherwise might land at the
  end of a huge patch queue.

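
The tagging rules from the draft above lend themselves to a mechanical
check. The following Python sketch reports which of the tags named in the
draft are missing from a patch description. It is only an illustration:
the function names are made up, the sample commit message, reporter, URL
and commit ID are hypothetical placeholders, and the check looks only at
the presence of the tags, not at their order or at whether their values
are valid.

```python
import re

FLAGS = re.MULTILINE | re.IGNORECASE


def has_tag(message: str, tag: str) -> bool:
    """True if some line of `message` starts with `tag:` and has a value."""
    return re.search(rf"^{re.escape(tag)}:\s*\S", message, FLAGS) is not None


def missing_regression_tags(message: str, needs_stable: bool = False) -> list[str]:
    """List the tags the draft asks for that `message` lacks."""
    missing = []
    if not has_tag(message, "Reported-by"):
        missing.append("Reported-by:")
    if not (has_tag(message, "Link") or has_tag(message, "Closes")):
        missing.append("Link: or Closes:")
    if not has_tag(message, "Fixes"):
        missing.append("Fixes:")
    # Only demanded when the culprit reached a release (see the draft).
    if needs_stable and not re.search(
        r"^Cc:.*stable@vger\.kernel\.org", message, FLAGS
    ):
        missing.append("Cc: stable@vger.kernel.org")
    return missing


# Hypothetical patch description; all names and IDs are placeholders.
example = """\
subsys: fix regression in foo handling

Reported-by: Jane Doe <jane@example.org>
Closes: https://example.org/report
Fixes: 0123456789ab ("subsys: rework foo handling")
"""

print(missing_regression_tags(example, needs_stable=True))
# ['Cc: stable@vger.kernel.org']
```

Whether `needs_stable` applies still takes human judgment: it depends on
whether the culprit made it into a release, which the patch description
alone does not reveal.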
* [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series
2024-06-13 8:22 [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions Thorsten Leemhuis
2024-06-13 8:26 ` [MAINTAINERS SUMMIT] [1/4] Create written down guidelines for handling regressions Thorsten Leemhuis
@ 2024-06-13 8:32 ` Thorsten Leemhuis
2024-06-13 11:02 ` Johannes Berg
` (3 more replies)
2024-06-13 8:34 ` [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users Thorsten Leemhuis
` (2 subsequent siblings)
4 siblings, 4 replies; 107+ messages in thread
From: Thorsten Leemhuis @ 2024-06-13 8:32 UTC (permalink / raw)
To: ksummit

I propose we extend the implications of the "no regressions" rule so
that mainline developers must ensure fixes for recent mainline
regressions make it to the latest stable series.

[FWIW, yes I'm well aware that this is a bold proposal; I also have no
idea how even Linus thinks about the idea. But I'm bringing it up anyway
to at least discuss this, as from my point of view it would fix what I
consider a kind of loophole regarding our "no regressions" rule -- at
least from the point of view of the users.]

We might have a "no regressions" rule, but nothing currently makes sure
that regressions introduced recently are fixed in a timely manner in the
latest stable series. Hence a fix for a regression found just hours
after a new mainline release (say 6.7) might only reach users weeks
later with its successor (e.g. 6.8) -- or, in unlucky cases when the fix
is only merged in the next merge window and not backported, only with
the second successor (6.9). The example scenario at the start of this
thread illustrates that in more detail.
To improve this situation I propose we add a rule like the following
somewhere:

"""Developers must ensure that fixes for regressions introduced in the
last development cycle make it to the latest stable series -- typically
by adding 'Fixes:' and 'CC: <stable…' tags to the patch description's
footer."""

I know I'm asking a lot here, especially from the file system folks due
to the testing this will require. And I fully understand the
participation in stable maintenance always has been and still is
optional for mainline developers -- and that this would change it.

But I'm bringing this up anyway, as users afaics expect "fix recently
introduced problems with new minor releases", as it is a pretty normal
thing in most other FLOSS projects that have minor releases. A lot of
kernel developers are already doing what I proposed anyway. It could be
viewed as fair to our user base, too. And without it the "no
regressions" rule might be considered hollow.

Note, to quickly fix such regressions in the latest stable series, they
obviously must first be fixed in a timely manner in mainline; that
aspect is ignored here, as proposal 3/4 of this thread will cover that.

Another note: the "Expectations and best practices for fixing
regressions" in Documentation/process/handling-regressions.rst (see
[1/4]) kind of covers this already. But it does not use the term "must";
at the same time the scope is bigger, too, which is partly due to a
statement from Linus[1]:

"""If a regression made it into a proper mainline release during the
past twelve months, ensure to tag the fix with
"Cc: stable@vger.kernel.org", as a "Fixes:" tag alone does not guarantee
a backport.
Please add the same tag, in case you know the culprit was backported to
stable or longterm kernels."""

[1]
https://lore.kernel.org/all/CAHk-=wis_qQy4oDNynNKi5b7Qhosmxtoj1jxo5wmB6SRUwQUBQ@mail.gmail.com/

Side note: due to the [1] above the rule this message proposes maybe
needs 's/introduced in the last development cycle/introduced in mainline
versions released during the past 12 months/' (or five or six releases
instead, as that is easier to keep track of?). But I guess with that
this proposal likely would be even less welcomed. :-/

* Re: [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series
2024-06-13 8:32 ` [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series Thorsten Leemhuis
@ 2024-06-13 11:02 ` Johannes Berg
2024-06-13 11:21 ` Greg KH
2024-06-13 11:17 ` Jiri Kosina
` (2 subsequent siblings)
3 siblings, 1 reply; 107+ messages in thread
From: Johannes Berg @ 2024-06-13 11:02 UTC (permalink / raw)
To: Thorsten Leemhuis, ksummit

On Thu, 2024-06-13 at 10:32 +0200, Thorsten Leemhuis wrote:
>
> I know I'm asking a lot here, especially from the file system folks due
> to the testing this will require. And I fully understand the
> participation in stable maintenance always has been and still is
> optional for mainline developers -- and that this would change it.
>
> But I'm bringing this up anyway, as users afaics expect "fix recently
> introduced problems with new minor releases'

You are saying that users can have it both ways: not test each release,
but actually get fixes in each release...

So no, I strongly object to putting *even* more work onto maintainers,
basically making us all responsible for stable releases.

johannes

* Re: [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series 2024-06-13 11:02 ` Johannes Berg @ 2024-06-13 11:21 ` Greg KH 2024-06-13 13:18 ` Sasha Levin 0 siblings, 1 reply; 107+ messages in thread From: Greg KH @ 2024-06-13 11:21 UTC (permalink / raw) To: Johannes Berg; +Cc: Thorsten Leemhuis, ksummit On Thu, Jun 13, 2024 at 01:02:44PM +0200, Johannes Berg wrote: > On Thu, 2024-06-13 at 10:32 +0200, Thorsten Leemhuis wrote: > > > > I know I'm asking a lot here, especially from the file system folks due > > to the testing this will require. And I fully understand the > > participation in stable maintenance always has been and still is > > optional for mainline developers -- and that this would change it. > > > > But I'm bringing this up anyway, as users afaics expect "fix recently > > introduced problems with new minor releases' > > You are saying that users can have it both ways: not test each release, > but actually get fixes in each release... > > So no, I strongly object to putting *even* more work onto maintainers, > basically making us all responsible for stable releases. I also agree. Remember, the FIRST rule of us doing a stable release at all was that we would NOT put any extra work on any maintainer or developer that did not want to do anything extra. Let's not change that please. thanks, greg k-h ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series 2024-06-13 11:21 ` Greg KH @ 2024-06-13 13:18 ` Sasha Levin 0 siblings, 0 replies; 107+ messages in thread From: Sasha Levin @ 2024-06-13 13:18 UTC (permalink / raw) To: Greg KH; +Cc: Johannes Berg, Thorsten Leemhuis, ksummit On Thu, Jun 13, 2024 at 01:21:15PM +0200, Greg KH wrote: >On Thu, Jun 13, 2024 at 01:02:44PM +0200, Johannes Berg wrote: >> On Thu, 2024-06-13 at 10:32 +0200, Thorsten Leemhuis wrote: >> > >> > I know I'm asking a lot here, especially from the file system folks due >> > to the testing this will require. And I fully understand the >> > participation in stable maintenance always has been and still is >> > optional for mainline developers -- and that this would change it. >> > >> > But I'm bringing this up anyway, as users afaics expect "fix recently >> > introduced problems with new minor releases' >> >> You are saying that users can have it both ways: not test each release, >> but actually get fixes in each release... >> >> So no, I strongly object to putting *even* more work onto maintainers, >> basically making us all responsible for stable releases. > >I also agree. Remember, the FIRST rule of us doing a stable release at >all was that we would NOT put any extra work on any maintainer or >developer that did not want to do anything extra. Let's not change that >please. Nor is this something we want to start policing on our end. What happens if someone breaks this rule? Do we ban them from sending stuff upstream? -- Thanks, Sasha ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series
2024-06-13 8:32 ` [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series Thorsten Leemhuis
2024-06-13 11:02 ` Johannes Berg
@ 2024-06-13 11:17 ` Jiri Kosina
2024-06-13 11:28 ` Laurent Pinchart
2024-06-14 14:01 ` Mark Brown
3 siblings, 0 replies; 107+ messages in thread
From: Jiri Kosina @ 2024-06-13 11:17 UTC (permalink / raw)
To: Thorsten Leemhuis; +Cc: ksummit
On Thu, 13 Jun 2024, Thorsten Leemhuis wrote:
> I propose we extend the implications of the "no regressions" rule so
> that mainline developers must ensure fixes for recent mainline
> regression make it to the latest stable series.

Sorry, but I am personally very strongly against that.

As a maintainer, I never felt responsible for the -stable tree, and I
believe this is the case for many others (please feel free to speak up
if you disagree). My only objective is to have all the features and
fixes land in mainline in a timely manner and in good quality.

This is definitely not a way to avoid maintainer burnout, quite the
contrary.

--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series 2024-06-13 8:32 ` [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series Thorsten Leemhuis 2024-06-13 11:02 ` Johannes Berg 2024-06-13 11:17 ` Jiri Kosina @ 2024-06-13 11:28 ` Laurent Pinchart 2024-06-14 0:50 ` Steven Rostedt 2024-06-14 14:01 ` Mark Brown 3 siblings, 1 reply; 107+ messages in thread From: Laurent Pinchart @ 2024-06-13 11:28 UTC (permalink / raw) To: Thorsten Leemhuis; +Cc: ksummit Hi Thorsten, On Thu, Jun 13, 2024 at 10:32:27AM +0200, Thorsten Leemhuis wrote: > I propose we extend the implications of the "no regressions" rule so > that mainline developers must ensure fixes for recent mainline > regression make it to the latest stable series. > > [FWIW, yes I'm well aware that this is a bold proposal; I also have no > idea how even Linus thinks about the idea. But I'm bringing it up anyway > to at least discuss this, as from my point of view it would fix what I > consider a kind of loophole regarding our "no regressions" rule -- at > least from the point of view of the users.] > > We might have a "no regressions" rule, but nothing currently makes sure > that regressions introduced recently are fixed in a timely manner in the > latest stable series. Hence a fix for a regression found just hours > after a new mainline release (say 6.7) might only reach users weeks > later with its successor (e.g. 6.8) -- or in unlucky cases when the fix > is only merged in the next merge window and not backported only with the > second successor (6.9). The example scenario at the start of this thread > illustrates that in more details. 
> > To improve this situation I propose we add a rule like the following > somewhere: > > """Developers must ensure that fixes for regressions introduced in the > last development cycle make it to the latest stable series -- typically > by adding 'Fixes:' and 'CC: <stable…' tags to the patch description's > footer.""" I think there's a general agreement that those tags are useful, should be used, and are already widely used. Reminding everybody, be they maintainers or not, is fine with me. Making this an extra strict duty for maintainers, however, is something I can't support. We already have a bad maintainer burnout problem, and this would make it worse, resulting in a worse long term outcome in my opinion. I would be more interested in exploring why regression fixes don't end up in stable releases in a timely manner, and seeing how we could improve that at no cost for maintainers. We may even be able to come up with processes and tools that, when used right, would save time for maintainers. That would have a higher chance of getting broader adoption. > I know I'm asking a lot here, especially from the file system folks due > to the testing this will require. And I fully understand the > participation in stable maintenance always has been and still is > optional for mainline developers -- and that this would change it. > > But I'm bringing this up anyway, as users afaics expect "fix recently > introduced problems with new minor releases', as it is a pretty normal > thing in most other FLOSS projects that have minor releases. A lot of > kernel developers are already doing what I proposed anyway. It could be > viewed as fair to our user base, too. And without it the "no > regressions" rule might be considered hollow. > > Note, to quickly fix such regression in the latest stable series such > regressions obviously must first be fixed in a timely manner in > mainline; that aspect is ignored here, as proposals 3/4 of this thread > will covers that. 
> > Another note: the "Expectations and best practices for fixing > regressions" in Documentation/process/handling-regressions.rst (see > [1/4] kind of covers this already. But it does not use the term "must"; > at the same time the scope is bigger, too, which is partly due to a > statement from Linus[1]: """If a regression made it into a proper > mainline release during the past twelve months, ensure to tag the fix > with "Cc: stable@vger.kernel.org", as a "Fixes:" tag alone does not > guarantee a backport. Please add the same tag, in case you know the > culprit was backported to stable or longterm kernels."""" > [1] > https://lore.kernel.org/all/CAHk-=wis_qQy4oDNynNKi5b7Qhosmxtoj1jxo5wmB6SRUwQUBQ@mail.gmail.com/ > > Side note: due to the [1] above the rule this messages proposes above > maybe needs 's/introduced in the last development cycle/introduced in > mainline versions released during the past 12 months/" (or five or six > releases instead, as that is easier to keep track of?). But I guess with > that this proposal likely would be even less welcomed. :-/ -- Regards, Laurent Pinchart ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series 2024-06-13 11:28 ` Laurent Pinchart @ 2024-06-14 0:50 ` Steven Rostedt 0 siblings, 0 replies; 107+ messages in thread From: Steven Rostedt @ 2024-06-14 0:50 UTC (permalink / raw) To: Laurent Pinchart; +Cc: Thorsten Leemhuis, ksummit On Thu, 13 Jun 2024 14:28:48 +0300 Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > Hi Thorsten, Hi Thorsten, I'm sure you had your flame suit on when you posted this ;-) > > On Thu, Jun 13, 2024 at 10:32:27AM +0200, Thorsten Leemhuis wrote: > > I propose we extend the implications of the "no regressions" rule so > > that mainline developers must ensure fixes for recent mainline > > regression make it to the latest stable series. > > > > [FWIW, yes I'm well aware that this is a bold proposal; I also have no > > idea how even Linus thinks about the idea. But I'm bringing it up anyway > > to at least discuss this, as from my point of view it would fix what I > > consider a kind of loophole regarding our "no regressions" rule -- at > > least from the point of view of the users.] > > > > We might have a "no regressions" rule, but nothing currently makes sure > > that regressions introduced recently are fixed in a timely manner in the > > latest stable series. Hence a fix for a regression found just hours > > after a new mainline release (say 6.7) might only reach users weeks > > later with its successor (e.g. 6.8) -- or in unlucky cases when the fix > > is only merged in the next merge window and not backported only with the > > second successor (6.9). The example scenario at the start of this thread > > illustrates that in more details. 
> > > > To improve this situation I propose we add a rule like the following > > somewhere: > > > > """Developers must ensure that fixes for regressions introduced in the > > last development cycle make it to the latest stable series -- typically > > by adding 'Fixes:' and 'CC: <stable…' tags to the patch description's > > footer.""" > > I think there's a general agreement that those tags are useful, should > be used, and are already widely used. Reminding everybody, be they > maintainers or not, is fine with me. Making this an extra strict duty > for maintainers, however, is something I can't support. We already have > a bad maintainer burnout problem, and this would make it worse, > resulting in a worse long term outcome in my opinion. > > I would be more interested in exploring why regression fixes don't end > up in stable releases in a timely manner, and seeing how we could > improve that at no cost for maintainers. We may even be able to come up > with processes and tools that, when used right, would save time for > maintainers. That would have a higher chance of getting broader > adoption. When reading this thread I was thinking somewhat the same thing. I like knowing about regressions, and having a way to track them. What would really be helpful is to have more ways to be able to catch regressions, and possibly better tooling to find where they started. I think the focus on this is to make it easier for maintainers to see there's a regression and where it started. But there should not be any requirement that the maintainer must deal with it. It could be something that others working in that subsystem could track. This could be used for those that want to start kernel development and keep asking us "do you have any todo list?". Well, this looks like the perfect todo list for people to take on. -- Steve ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series
2024-06-13 8:32 ` [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series Thorsten Leemhuis
` (2 preceding siblings ...)
2024-06-13 11:28 ` Laurent Pinchart
@ 2024-06-14 14:01 ` Mark Brown
2024-06-14 14:32 ` Rafael J. Wysocki
3 siblings, 1 reply; 107+ messages in thread
From: Mark Brown @ 2024-06-14 14:01 UTC (permalink / raw)
To: Thorsten Leemhuis; +Cc: ksummit
On Thu, Jun 13, 2024 at 10:32:27AM +0200, Thorsten Leemhuis wrote:
> I propose we extend the implications of the "no regressions" rule so
> that mainline developers must ensure fixes for recent mainline
> regression make it to the latest stable series.

I do note that there is already a bunch of disquiet about what makes it
into stable...

> """Developers must ensure that fixes for regressions introduced in the
> last development cycle make it to the latest stable series -- typically
> by adding 'Fixes:' and 'CC: <stable…' tags to the patch description's
> footer."""

Personally I stopped bothering with manually Ccing stable because the
stable team already picks up much more than I'm comfortable with;
devoting any effort to thinking about what might go to stable just
doesn't seem like a good use of time.

We also already have problems with people spamming Fixes tags onto
things that are not really bugs or where not much effort appears to have
gone into identifying a relevant commit; I think some people have
internal process pressures on having Fixes tags for the sake of it.
Demanding that people who don't really care fill in the blank to appease
some workflow strategist doesn't seem likely to improve the quality of
information provided any.

> But I'm bringing this up anyway, as users afaics expect "fix recently
> introduced problems with new minor releases', as it is a pretty normal
> thing in most other FLOSS projects that have minor releases. A lot of
> kernel developers are already doing what I proposed anyway. It could be
> viewed as fair to our user base, too. And without it the "no
> regressions" rule might be considered hollow.

I think a lot of projects have a much greater expectation that a large
part of their audience will directly use their releases. While obviously
people do directly take and run kernel releases, the much more common
path is via some third party that usually does some integration and QA
work.
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series 2024-06-14 14:01 ` Mark Brown @ 2024-06-14 14:32 ` Rafael J. Wysocki 0 siblings, 0 replies; 107+ messages in thread From: Rafael J. Wysocki @ 2024-06-14 14:32 UTC (permalink / raw) To: Mark Brown; +Cc: Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 4:01 PM Mark Brown <broonie@kernel.org> wrote: > > On Thu, Jun 13, 2024 at 10:32:27AM +0200, Thorsten Leemhuis wrote: > > > I propose we extend the implications of the "no regressions" rule so > > that mainline developers must ensure fixes for recent mainline > > regression make it to the latest stable series. > > I do note that there is already a bunch of disquiet about what makes it > into stable... > > > """Developers must ensure that fixes for regressions introduced in the > > last development cycle make it to the latest stable series -- typically > > by adding 'Fixes:' and 'CC: <stable…' tags to the patch description's > > footer.""" > > Personally I stopped bothering with manually Ccing stable because the > stable team already picks up much more than I'm comfortable with, > devoting any effort to thinking about what might go to stable just > doesn't seem like a good use of time. Same here mostly except for 3 cases: - When I want to limit the scope of the backports. - When I want the patch to get into "stable" earlier than autosel would pick it up. - When there are dependencies I want "stable" to know about. > We also already have problems with people spamming fixes tags onto > things that are not really bugs or where not much effort appears to have > gone into identifying a relevant commit, I think some people have > internal process pressures on having Fixes tags for the sake of it. > Demanding that people who don't really care fill in the blank to appease > some workflow strategist doesn't seem likely to improve the quality of > information provided any. 
+1 ^ permalink raw reply [flat|nested] 107+ messages in thread
* [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users
2024-06-13 8:22 [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions Thorsten Leemhuis
2024-06-13 8:26 ` [MAINTAINERS SUMMIT] [1/4] Create written down guidelines for handling regressions Thorsten Leemhuis
2024-06-13 8:32 ` [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series Thorsten Leemhuis
@ 2024-06-13 8:34 ` Thorsten Leemhuis
2024-06-13 11:34 ` Laurent Pinchart
2024-06-13 8:42 ` [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions Thorsten Leemhuis
2024-06-18 14:43 ` [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions James Bottomley
4 siblings, 1 reply; 107+ messages in thread
From: Thorsten Leemhuis @ 2024-06-13 8:34 UTC (permalink / raw)
To: ksummit
I propose we elevate fixing of mainline regressions that made it to
releases deemed for end users by setting a target to ideally mainline a
fix (which might be a revert) within two weeks after the culprit was found.

That IMHO would lessen one of the big pain points for users regarding
regressions, as quite a few make it into proper releases and then take
quite a while to resolve (as shown in the scenario in the mail at the
start of this thread). So much so that quite a few users afaics doubt
that we take our "no regressions" rule seriously.

This is why I'd like to see such situations resolved even faster than
regressions that happen just in development kernels. "Expectations and
best practices for fixing regressions" in
Documentation/process/handling-regressions.rst (see [1/4] in this
thread) kind of covers this already:

"""Expedite fixing mainline regressions that recently made it into a
proper mainline, stable, or longterm release (either directly or via
backport). [...] Aim to mainline a fix by Sunday after the next, if the
culprit made it into a recent mainline, stable, or longterm release
(either directly or via backport); if the culprit became known early
during a week and is simple to resolve, try to mainline the fix within
the same week. [...]"""

I'd like to make the language somewhat stronger:

"""Handle mainline regressions that recently made it into a proper
mainline, stable, or longterm release (either directly or via backport)
with an even higher priority and try to fix them as fast as possible.
[...] Aim hard to mainline a fix by Sunday after the next, if the
culprit made it into a recent mainline, stable, or longterm release
(either directly or via backport); try to mainline the fix within the
same week, if the regression apparently bothers quite a few users or if
the problem with the culprit became known on a Monday or Tuesday."""
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-13 8:34 ` [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users Thorsten Leemhuis @ 2024-06-13 11:34 ` Laurent Pinchart 2024-06-13 11:39 ` Jiri Kosina ` (2 more replies) 0 siblings, 3 replies; 107+ messages in thread From: Laurent Pinchart @ 2024-06-13 11:34 UTC (permalink / raw) To: Thorsten Leemhuis; +Cc: ksummit Hi Thorsten, On Thu, Jun 13, 2024 at 10:34:17AM +0200, Thorsten Leemhuis wrote: > I propose we elevate fixing of mainline regressions that made it to > releases deemed for end users by setting a target to ideally mainline a > fix (which might be a revert) within two weeks after the culprit was found. > > That IMHO would lessen one of the big pain points for users regarding > regressions, as quite a few make it into proper release and then take > quite a while to resolve (as shown in the scenario in the mail at the > start of this thread). So much so that quite a few users afaics doubt > that we take our "no regression" rule seriously. > > This is why I'd like to see such situations resolved even faster than > regression that happen just in development kernels. "Expectations and > best practices for fixing regressions" in > Documentation/process/handling-regressions.rst (see [1/4] in this > thread) kind of covers this already: > > """Expedite fixing mainline regressions that recently made it into a > proper mainline, stable, or longterm release (either directly or via > backport). [...] Aim to mainline a fix by Sunday after the next, if the > culprit made it into a recent mainline, stable, or longterm release > (either directly or via backport); if the culprit became known early > during a week and is simple to resolve, try to mainline the fix within > the same week. [...]""" > > I'd like to make the language somewhat stronger. 
>
> """Handle mainline regressions that recently made it into a proper
> mainline, stable, or longterm release (either directly or via backport)
> with an even higher priority and try to fix them as fast as possible.
> [...] Aim hard to mainline a fix by Sunday after the next, if the

Are we really telling people, some of them contributing in their spare
time, that they have to work during weekends?

I don't think piling pressure will help. What could help is to reduce
pressure on already overloaded maintainers, to give them more time to
handle regressions. There have been multiple discussions about
co-maintenance models over the past few years, and some subsystems are
(slowly) moving forward. I would be more interested in participating in
that effort.

It otherwise feels like we would just add pressure on an already
overloaded system, without caring that the system has no reasonable way
to absorb that pressure.

> culprit made it into a recent mainline, stable, or longterm release
> (either directly or via backport); try to mainline the fix within the
> same week, if the regression apparently bothers quite a few users or if
> the problem with the culprit became known on a Monday or Tuesday."""

--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users
2024-06-13 11:34 ` Laurent Pinchart
@ 2024-06-13 11:39 ` Jiri Kosina
2024-06-14 14:10 ` Mark Brown
2024-06-13 15:56 ` Liam R. Howlett
2024-06-18 12:24 ` Thorsten Leemhuis
2 siblings, 1 reply; 107+ messages in thread
From: Jiri Kosina @ 2024-06-13 11:39 UTC (permalink / raw)
To: Laurent Pinchart; +Cc: Thorsten Leemhuis, ksummit
On Thu, 13 Jun 2024, Laurent Pinchart wrote:
> I don't think piling pressure will help. What could help is to reduce
> pressure on already overloaded maintainers, to give them more time to
> handle regressions. There have been multiple discussions about
> co-maintainance models over the past few years, and some subsystems are
> (slowly) moving forward. I would be more interested in participating in
> that effort.

Fully agreed. That's exactly why a few days ago I proposed the topic about
exploring the options of making the merge tree deeper (by delegating
more and making the co-maintainership model more prominent), as that in my
view is the only available solution to the current maintainer pressure
problem.

> It otherwise feels like we would just add pressure on an already
> overloaded system, without caring that the system has no reasonable way
> to absorb that pressure.

100% agreed as well.

Thanks,

--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users
2024-06-13 11:39 ` Jiri Kosina
@ 2024-06-14 14:10 ` Mark Brown
2024-06-18 12:58 ` Thorsten Leemhuis
0 siblings, 1 reply; 107+ messages in thread
From: Mark Brown @ 2024-06-14 14:10 UTC (permalink / raw)
To: Jiri Kosina; +Cc: Laurent Pinchart, Thorsten Leemhuis, ksummit
On Thu, Jun 13, 2024 at 01:39:00PM +0200, Jiri Kosina wrote:
> On Thu, 13 Jun 2024, Laurent Pinchart wrote:

> > I don't think piling pressure will help. What could help is to reduce
> > pressure on already overloaded maintainers, to give them more time to
> > handle regressions. There have been multiple discussions about
> > co-maintainance models over the past few years, and some subsystems are
> > (slowly) moving forward. I would be more interested in participating in
> > that effort.

> Fully agreed. That's exactly why a few days ago I proposed the topic about
> exploring the options of making the merge tree more deep (by delegating
> more and making the co-maintainership model more prominent), as that in my
> view is the only available solution to the current maintainer pressure
> problem.

In my experience deeper maintainer trees are often a factor in slowing
down patches; passing things between maintainers often just inherently
adds delays even if nobody goes on holiday or whatever. Group
maintainership mitigates things like holidays but not things like
stabilisation periods.
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users
2024-06-14 14:10 ` Mark Brown
@ 2024-06-18 12:58 ` Thorsten Leemhuis
2024-06-19 20:25 ` Laurent Pinchart
0 siblings, 1 reply; 107+ messages in thread
From: Thorsten Leemhuis @ 2024-06-18 12:58 UTC (permalink / raw)
To: Mark Brown, Jiri Kosina; +Cc: Laurent Pinchart, ksummit
On 14.06.24 16:10, Mark Brown wrote:
> On Thu, Jun 13, 2024 at 01:39:00PM +0200, Jiri Kosina wrote:
>> On Thu, 13 Jun 2024, Laurent Pinchart wrote:
>
>>> I don't think piling pressure will help. What could help is to reduce
>>> pressure on already overloaded maintainers, to give them more time to
>>> handle regressions. There have been multiple discussions about
>>> co-maintainance models over the past few years, and some subsystems are
>>> (slowly) moving forward. I would be more interested in participating in
>>> that effort.
>
>> Fully agreed. That's exactly why a few days ago I proposed the topic about
>> exploring the options of making the merge tree more deep (by delegating
>> more and making the co-maintainership model more prominent), as that in my
>> view is the only available solution to the current maintainer pressure
>> problem.
>
> In my experience deeper maintainer trees are often a factor in slowing
> down patches, passing things between maintainers often just inherently
> adds delays even if nobody goes on holiday or whatever.

From what I see from the regressions perspective they are not ideal
either. The slowdown is one problem, unless the process is streamlined
well. Another one from my biased point of view seems to be that a few of
them are far away from Linus and apparently not fully aware of how he
wants regressions to be handled.

Which is not really surprising, as over the years there were quite a few
cases where maintainers of core subsystems were not handled well either.
But sooner or later that resulted in a clash with Linus[1] and from then
on things worked better. For many sub-subsystems something like that
never happened -- and the maintainers of the higher level subsystem
cannot have their eyes everywhere, so they do not notice such problems or
are more lax and friendly.

If I notice a regression is not handled well in a sub-subsystem I point
it out (often in private) to the higher level maintainers. But that is
tedious, does not scale, and delays things. That's one of the reasons
why written guidelines IMHO would be worth it.

Ciao, Thorsten

[1] see the quotes from Linus at the end of
Documentation/process/handling-regressions.rst /
https://docs.kernel.org/process/handling-regressions.html
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-18 12:58 ` Thorsten Leemhuis @ 2024-06-19 20:25 ` Laurent Pinchart 2024-06-20 10:47 ` Thorsten Leemhuis 0 siblings, 1 reply; 107+ messages in thread From: Laurent Pinchart @ 2024-06-19 20:25 UTC (permalink / raw) To: Thorsten Leemhuis; +Cc: Mark Brown, Jiri Kosina, ksummit On Tue, Jun 18, 2024 at 02:58:38PM +0200, Thorsten Leemhuis wrote: > On 14.06.24 16:10, Mark Brown wrote: > > On Thu, Jun 13, 2024 at 01:39:00PM +0200, Jiri Kosina wrote: > >> On Thu, 13 Jun 2024, Laurent Pinchart wrote: > > > >>> I don't think piling pressure will help. What could help is to reduce > >>> pressure on already overloaded maintainers, to give them more time to > >>> handle regressions. There have been multiple discussions about > >>> co-maintainance models over the past few years, and some subsystems are > >>> (slowly) moving forward. I would be more interested in participating in > >>> that effort. > > > >> Fully agreed. That's exactly why a few days ago I proposed the topic about > >> exploring the options of making the merge tree more deep (by delegating > >> more and making the co-maintainership model more prominent), as that in my > >> view is the only available solution to the current maintainer pressure > >> problem. > > > > In my experience deeper maintainer trees are often a factor in slowing > > down patches, passing things between maintainers often just inherently > > adds delays even if nobody goes on holiday or whatever. > > From what I see from the regressions perspective they are not ideal > either. The slow down is one problem, unless the process is streamlined > well. Another one from my biased point of view seems to be that a few of > are far away from Linus and apparently not fully aware how he wants > regressions to be handled. 
>
> Which is not really surprising, as over the years there were quite a few
> cases where maintainers of core subsystems were not handled well either.
> But sooner or later that resulted in a clash with Linus[1] and from then
> on things worked better. For many sub-subsystem something like that
> never happened -- and the maintainers of the higher level subsystem can
> not have their eyes everywhere, so they do not notice such problems or
> are more lax and friendly.

This got me thinking: why don't we have training for maintainers,
instead of expecting people to decipher a combination of unwritten
rules, and written documentation containing conflicting and partly
outdated information? Of course, the whole path going from a first
submission to the kernel to maintaining a subsystem is some sort of
training (even if it looks like the kind of epic training the hero
ninja/warrior/sorcerer will undergo on their perilous journey to saving
the kingdom more than a process designed to optimize the end result),
but at best that provides partial knowledge of the expected maintenance
process, without even mentioning the issue of keeping the knowledge up
to date.

It sometimes feels like we're discussing how to improve the process
without even considering that many people are not even aware there is a
process.

> If I notice a regression is not handled well in a sub-subsystem I point
> it out (often in private) to the higher level maintainers. But that does
> it tedious, does not scale, and delays things. That's one of the reasons
> why written guidelines IMHO would be worth it.
>
> Ciao, Thorsten
>
> [1] see the quotes from Linus at the end of
> Documentation/process/handling-regressions.rst /
> https://docs.kernel.org/process/handling-regressions.html

--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-19 20:25 ` Laurent Pinchart @ 2024-06-20 10:47 ` Thorsten Leemhuis 0 siblings, 0 replies; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-20 10:47 UTC (permalink / raw) To: Laurent Pinchart; +Cc: Mark Brown, Jiri Kosina, ksummit On 19.06.24 22:25, Laurent Pinchart wrote: > On Tue, Jun 18, 2024 at 02:58:38PM +0200, Thorsten Leemhuis wrote: >> On 14.06.24 16:10, Mark Brown wrote: >>> On Thu, Jun 13, 2024 at 01:39:00PM +0200, Jiri Kosina wrote: >>>> On Thu, 13 Jun 2024, Laurent Pinchart wrote: >>> >>>>> I don't think piling pressure will help. What could help is to reduce >>>>> pressure on already overloaded maintainers, to give them more time to >>>>> handle regressions. There have been multiple discussions about >>>>> co-maintainance models over the past few years, and some subsystems are >>>>> (slowly) moving forward. I would be more interested in participating in >>>>> that effort. >>> >>>> Fully agreed. That's exactly why a few days ago I proposed the topic about >>>> exploring the options of making the merge tree more deep (by delegating >>>> more and making the co-maintainership model more prominent), as that in my >>>> view is the only available solution to the current maintainer pressure >>>> problem. >>> >>> In my experience deeper maintainer trees are often a factor in slowing >>> down patches, passing things between maintainers often just inherently >>> adds delays even if nobody goes on holiday or whatever. >> >> From what I see from the regressions perspective they are not ideal >> either. The slow down is one problem, unless the process is streamlined >> well. Another one from my biased point of view seems to be that a few of >> are far away from Linus and apparently not fully aware how he wants >> regressions to be handled. 
>> >> Which is not really surprising, as over the years there were quite a few >> cases where maintainers of core subsystems were not handled well either. >> But sooner or later that resulted in a clash with Linus[1] and from then >> on things worked better. For many sub-subsystem something like that >> never happened -- and the maintainers of the higher level subsystem can >> not have their eyes everywhere, so they do not notice such problems or >> are more lax and friendly. > > This got me thinking: why don't we have trainings for maintainers, > instead of expecting people to decypher a combination of unwritten > rules, and written documentation containing conflicting and partly > outdated information ? Good point, but even if that training existed, it would be given by someone who deciphered "a combination of unwritten rules, and written documentation containing conflicting and partly outdated information" (as well as posts from Linus about the topic found on lore). At least unless we get Linus to give that training himself. :-) Which is why I try to write something down and get it blessed -- ideally by Linus, but if a handful of core maintainers agree on it, that's good enough for me as well. > Of course, the whole path going from a first > submission to the kernel to maintaining a subsystem is some sort of > training (even if it looks like the kind of epic training the hero > ninja/warrior/sorcerer will undergo on their perilous journey to saving > the kingdom more than a process designed to optimize the end result), :-D > but at best that provides partial knowledge of the expected maintenance > process, +1 > without even mentioning the issue of keeping the knowledge up > to date. Once written down a simple diff will take care of that -- ideally of course coupled with some way to announce the change to make people aware of it.
But as I said somewhere else in this thread: writing things down has downsides as well (misinterpretations/misunderstandings, situations where ignoring them is for the greater good the right thing to do, ...). > It sometimes feels we're discussing how to improve the process > without even considering that many people are not even aware there is a > process. Ciao, Thorsten ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-13 11:34 ` Laurent Pinchart 2024-06-13 11:39 ` Jiri Kosina @ 2024-06-13 15:56 ` Liam R. Howlett 2024-06-18 12:24 ` Thorsten Leemhuis 2 siblings, 0 replies; 107+ messages in thread From: Liam R. Howlett @ 2024-06-13 15:56 UTC (permalink / raw) To: Laurent Pinchart; +Cc: Thorsten Leemhuis, ksummit * Laurent Pinchart <laurent.pinchart@ideasonboard.com> [240613 07:35]: > Hi Thorsten, > > On Thu, Jun 13, 2024 at 10:34:17AM +0200, Thorsten Leemhuis wrote: > > I propose we elevate fixing of mainline regressions that made it to > > releases deemed for end users by setting a target to ideally mainline a > > fix (which might be a revert) within two weeks after the culprit was found. > > > > That IMHO would lessen one of the big pain points for users regarding > > regressions, as quite a few make it into proper release and then take > > quite a while to resolve (as shown in the scenario in the mail at the > > start of this thread). So much so that quite a few users afaics doubt > > that we take our "no regression" rule seriously. > > > > This is why I'd like to see such situations resolved even faster than > > regression that happen just in development kernels. "Expectations and > > best practices for fixing regressions" in > > Documentation/process/handling-regressions.rst (see [1/4] in this > > thread) kind of covers this already: > > > > """Expedite fixing mainline regressions that recently made it into a > > proper mainline, stable, or longterm release (either directly or via > > backport). [...] Aim to mainline a fix by Sunday after the next, if the > > culprit made it into a recent mainline, stable, or longterm release > > (either directly or via backport); if the culprit became known early > > during a week and is simple to resolve, try to mainline the fix within > > the same week. [...]""" > > > > I'd like to make the language somewhat stronger. 
> > > > """Handle mainline regressions that recently made it into a proper > > mainline, stable, or longterm release (either directly or via backport) > > with an even higher priority and try to fix them as fast as possible. > > [...] Aim hard to mainline a fix by Sunday after the next, if the > > Are we really telling people, some of them contributing in their spare > time, that they have to work during weekends ? > > I don't think piling pressure will help. What could help is to reduce > pressure on already overloaded maintainers, to give them more time to > handle regressions. There have been multiple discussions about > co-maintainance models over the past few years, and some subsystems are > (slowly) moving forward. I would be more interested in participating in > that effort. It otherwise feels like we would just add pressure on an > already overloaded system, without caring that the system has no > reasonable way to absorb that pressure. > > > culprit made it into a recent mainline, stable, or longterm release > > (either directly or via backport); try to mainline the fix within the > > same week, if the regression apparently bothers quite a few users or if > > the problem with the culprit became known on a Monday or Tuesday.""" Yes, this is the worst idea of all the really bad ideas in this set. They are so bad that it seems like the point is to actively damage the community. Either people will leave, won't join, or it will stall development in fears of retribution and/or punishment. Regards, Liam ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-13 11:34 ` Laurent Pinchart 2024-06-13 11:39 ` Jiri Kosina 2024-06-13 15:56 ` Liam R. Howlett @ 2024-06-18 12:24 ` Thorsten Leemhuis 2024-06-20 13:20 ` Jani Nikula 2 siblings, 1 reply; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-18 12:24 UTC (permalink / raw) To: Laurent Pinchart; +Cc: ksummit On 13.06.24 13:34, Laurent Pinchart wrote: > On Thu, Jun 13, 2024 at 10:34:17AM +0200, Thorsten Leemhuis wrote: > >> I'd like to make the language somewhat stronger. >> >> """Handle mainline regressions that recently made it into a proper >> mainline, stable, or longterm release (either directly or via backport) >> with an even higher priority and try to fix them as fast as possible. >> [...] Aim hard to mainline a fix by Sunday after the next, if the > > Are we really telling people, some of them contributing in their spare > time, that they have to work during weekends ? To clarify: I'm not asking for that at all. The aim for Sunday is only here because Linus usually releases new -rc's on Sunday evenings, which quite a few people seem to use. So from the regressions point of view it's better to flush fixes to Linus late in the week (say on Friday -- or if you want on Sat or Sun, which some subsystems do), and not on a Monday, as people that use -rcs otherwise will run into the regression for yet another week -- and sometimes report it again, when the fix was just mainlined. What wording can avoid this? "By the end of the (current/next) week" maybe? In a business context that afaik usually means Friday, but I'm not a native speaker, so I might be wrong there. Ciao, Thorsten ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-18 12:24 ` Thorsten Leemhuis @ 2024-06-20 13:20 ` Jani Nikula 2024-06-20 13:35 ` Thorsten Leemhuis 0 siblings, 1 reply; 107+ messages in thread From: Jani Nikula @ 2024-06-20 13:20 UTC (permalink / raw) To: Thorsten Leemhuis, Laurent Pinchart; +Cc: ksummit On Tue, 18 Jun 2024, Thorsten Leemhuis <linux@leemhuis.info> wrote: > On 13.06.24 13:34, Laurent Pinchart wrote: > >> On Thu, Jun 13, 2024 at 10:34:17AM +0200, Thorsten Leemhuis wrote: >> >>> I'd like to make the language somewhat stronger. >>> >>> """Handle mainline regressions that recently made it into a proper >>> mainline, stable, or longterm release (either directly or via backport) >>> with an even higher priority and try to fix them as fast as possible. >>> [...] Aim hard to mainline a fix by Sunday after the next, if the >> >> Are we really telling people, some of them contributing in their spare >> time, that they have to work during weekends ? > > To clarify: I'm not asking for that at all. The aim for Sunday is only > here because Linus usually releases new -rc's on Sunday evenings, which > quite a few people seem to use. So from the regressions point of view > it's better to flush fixes to Linus late in the week (say on Friday -- > or if you want on Sat or Sun, which some subsystem do), and not on a > Monday, as people that use -rcs otherwise will run into the regression > for yet another week -- and sometimes report it again, when the fix was > just mainlined. > > What wording can avoid this? "By the end of the (current/next) week" > maybe? In business context that afaik usually mean Fridays, but I'm not > a native speaker, so might be wrong there. Perhaps try wording it in terms of -rc/release instead of calendar? That's what we want, anyway, and it depends on the driver/subsystem how early you need to be to hit that target. 
See, adding another level of abstraction works in language in general, not just programming. ;) BR, Jani. > > Ciao, Thorsten > -- Jani Nikula, Intel ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-20 13:20 ` Jani Nikula @ 2024-06-20 13:35 ` Thorsten Leemhuis 2024-06-20 14:16 ` Mark Brown 2024-06-21 6:47 ` Jiri Kosina 0 siblings, 2 replies; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-20 13:35 UTC (permalink / raw) To: Jani Nikula, Laurent Pinchart; +Cc: ksummit On 20.06.24 15:20, Jani Nikula wrote: > On Tue, 18 Jun 2024, Thorsten Leemhuis <linux@leemhuis.info> wrote: >> On 13.06.24 13:34, Laurent Pinchart wrote: >> >>> On Thu, Jun 13, 2024 at 10:34:17AM +0200, Thorsten Leemhuis wrote: >>> >>>> I'd like to make the language somewhat stronger. >>>> >>>> """Handle mainline regressions that recently made it into a proper >>>> mainline, stable, or longterm release (either directly or via backport) >>>> with an even higher priority and try to fix them as fast as possible. >>>> [...] Aim hard to mainline a fix by Sunday after the next, if the >>> >>> Are we really telling people, some of them contributing in their spare >>> time, that they have to work during weekends ? >> >> To clarify: I'm not asking for that at all. The aim for Sunday is only >> here because Linus usually releases new -rc's on Sunday evenings, which >> quite a few people seem to use. So from the regressions point of view >> it's better to flush fixes to Linus late in the week (say on Friday -- >> or if you want on Sat or Sun, which some subsystem do), and not on a >> Monday, as people that use -rcs otherwise will run into the regression >> for yet another week -- and sometimes report it again, when the fix was >> just mainlined. >> >> What wording can avoid this? "By the end of the (current/next) week" >> maybe? In business context that afaik usually mean Fridays, but I'm not >> a native speaker, so might be wrong there. > > Perhaps try wording it in terms of -rc/release instead of calendar? 
Not totally against that, but the thing is: in an earlier local draft it used to be like that. And then I noticed that this will add another week when it comes to the merge window. And for a mainline regression that makes quite a difference at a (from the users' perspective) crucial point in time: during and right after the window during which distros like Arch Linux and openSUSE Tumbleweed usually switch to the newest mainline release or stable release derived from it (Fedora is usually a bit later). Ciao, Thorsten ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-20 13:35 ` Thorsten Leemhuis @ 2024-06-20 14:16 ` Mark Brown 2024-06-21 6:47 ` Jiri Kosina 1 sibling, 0 replies; 107+ messages in thread From: Mark Brown @ 2024-06-20 14:16 UTC (permalink / raw) To: Thorsten Leemhuis; +Cc: Jani Nikula, Laurent Pinchart, ksummit [-- Attachment #1: Type: text/plain, Size: 971 bytes --] On Thu, Jun 20, 2024 at 03:35:05PM +0200, Thorsten Leemhuis wrote: > On 20.06.24 15:20, Jani Nikula wrote: > > On Tue, 18 Jun 2024, Thorsten Leemhuis <linux@leemhuis.info> wrote: > >> What wording can avoid this? "By the end of the (current/next) week" > >> maybe? In business context that afaik usually mean Fridays, but I'm not > >> a native speaker, so might be wrong there. > > Perhaps try wording it in terms of -rc/release instead of calendar? > Not totally against that, but the thing is: in a earlier local draft it > used to be like that. And then I noticed that this will add another week > when it comes to the merge window. I don't think rules lawyering the specific wording is going to make an enormous difference here, people are going to try to do something sensible anyway and the merge window is just different to the normal flow. You need something that's a suitable combination of comprehensible and not looking like unreasonable micromanagement. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-20 13:35 ` Thorsten Leemhuis 2024-06-20 14:16 ` Mark Brown @ 2024-06-21 6:47 ` Jiri Kosina 2024-06-21 10:19 ` Thorsten Leemhuis 1 sibling, 1 reply; 107+ messages in thread From: Jiri Kosina @ 2024-06-21 6:47 UTC (permalink / raw) To: Thorsten Leemhuis; +Cc: Jani Nikula, Laurent Pinchart, ksummit On Thu, 20 Jun 2024, Thorsten Leemhuis wrote: > Not totally against that, but the thing is: in a earlier local draft it > used to be like that. And then I noticed that this will add another week > when it comes to the merge window. I'll be repeating myself, but I personally really don't like any such strong timelines in kernel documentation. You are increasing the pressure on maintainers for no benefit. Everybody of course tries to get the regressions fixed as soon as possible once it's identified. Sometimes it takes longer, because it's complex. Sometimes people are on holidays. Sometimes it just falls in between cracks, and people need to be pinged to look into that. Kernel documentation wording is not going to stop any of that. Also it might be setting unrealistic expectations at the user/reporter side. -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users 2024-06-21 6:47 ` Jiri Kosina @ 2024-06-21 10:19 ` Thorsten Leemhuis 0 siblings, 0 replies; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-21 10:19 UTC (permalink / raw) To: Jiri Kosina; +Cc: Jani Nikula, Laurent Pinchart, ksummit On 21.06.24 08:47, Jiri Kosina wrote: > On Thu, 20 Jun 2024, Thorsten Leemhuis wrote: > > I'll be repeating myself, but I personally really don't like any such > strong timelines in kernel documentation. FWIW, I really tried hard to avoid *strong* wording, which is why the sections about timing use words like "aim for". > Everybody > of course tries to get the regressions fixed as soon as possible once it's > identified. We have a different perspective here, as I often see what the scenario at the start of the thread describes: a fix is developed soon (so up to that point we agree), but then takes relatively long to reach mainline (and thus affected stable series that contain the culprit, too). > Sometimes it takes longer, because it's complex. Sure, that's not a problem I care about here at all -- unless a revert could fix the regression quickly instead, especially if the culprit made it to a stable tree somehow. > Sometimes people are on > holidays. Sometimes it just falls in between cracks, and people need > to be pinged to look into that. This will always happen and that does not really bother me. > Kernel documentation wording is not going to stop any of that. Definitely not, and that's not the goal I'm trying to reach here. Documentation is meant to outline roughly what's expected to bring everyone on the same page, as right now we have different interpretations of posts from Linus people might or might not have seen, coupled with a combination of unwritten rules, and written documentation containing conflicting and partly outdated information. Ciao, Thorsten ^ permalink raw reply [flat|nested] 107+ messages in thread
* [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 8:22 [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions Thorsten Leemhuis ` (2 preceding siblings ...) 2024-06-13 8:34 ` [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users Thorsten Leemhuis @ 2024-06-13 8:42 ` Thorsten Leemhuis 2024-06-13 9:59 ` Jan Kara ` (3 more replies) 2024-06-18 14:43 ` [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions James Bottomley 4 siblings, 4 replies; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-13 8:42 UTC (permalink / raw) To: ksummit I would like to discuss how to better prevent backports of mainline commits to stable that turn out to cause regressions. The scenario shown at the start of the thread illustrates a problem I see frequently: commits with a Fixes: tag end up in new to stable series releases just days after being mainlined and cause regressions -- just like they do in mainline, which just was not known yet at the time of backporting. This happens extremely often right after merge windows when huge piles of changes are backported to the stable trees each cycle shortly after -rc1 is out (which even some kernel developers apparently are somewhat afraid to test from what I've seen). I do not want to criticize the stable team for their approach to things, as I can understand why they are doing this: there is no simple way to distinguish "this is an urgent (security) fix that should be quickly backported" from "this should be tested in mainline for a while first" or "this has a Fixes: tag, but is not backport-worthy at all" -- which is why they handle changes about equally. I think untangling that aspect and backporting the non-urgent ones more slowly could help a lot to prevent many regressions from hitting stable trees. The thing is: I'm not sure how to achieve that. 
Here are a few thoughts my brain came up with:

* For patches that are tagged for backporting it's easy for developers to influence the timing, as they can use a stable tag like `Cc: <stable@vger.kernel.org> # after -rc4` to delay backporting (see Documentation/process/stable-kernel-rules.rst for details). But for quite a few developers this is not an option, as such a Cc: implies that the developer wants the fix to be backported -- and thus should ideally have tested it, will provide an adjusted patch when needed, and is willing to handle any fallout. Maybe untangling these aspects from the stable tag might be wise, so that developers can signal "I think this should be backported, but I don't want anything to do with it". If this becomes the norm then maybe the stable team could even stop taking nearly everything with a Fixes: tag. But I'm not sure if I like this idea myself, as it has downsides, too.

* We could ask the stable team to only backport changes once they have been in mainline for a certain time (something like "at the earliest two weeks after the change was present in a mainline release or pre-release"?). But to not delay urgent fixes we then would need developers to mark the urgent ones somehow. That is likely a hard sell, but maybe less so than what the previous point outlined; untangling could help here, too.

* Maybe convince the stable team to consider all commits with just a Fixes: tag as "non urgent", if they were merged during a merge window with a committer (or author?) date from before the merge window -- and then only backport them after -rc4 to ensure they got at least three weeks of mainline testing before they are backported. This is imperfect and has downsides, but would be relatively simple to realize.

* We could extend the Fixes tag in a fashion similar to the stable tag (see above) to establish something like `Fixes: cafec0cacafe ("foo: bar: foobar baz") # after -rc4 if considered backportworthy` -- but some of these lines will become awfully long (they already are occasionally even without this add-on note). Ciao, Thorsten P.S.: Related things that could be discussed: * One cause of regressions that happen in stable trees (and not in mainline) I've seen quite a few times are backports of commits with Fixes: tags that were part of a patch-series and depend on earlier patches from the series. The stable-team afaics has no easy way to spot this, as there is no way to check "was this change part of a series". Sometimes I wonder if a dedicated tag linking to the submission of a patch could help -- and is something quite a few maintainers already really want and add using a "Link" tag despite Linus dislike for that (IIRC). But following that link for each and every patch slated for backporting does not scale for the stable team anyway, so it's likely not worth it. ^ permalink raw reply [flat|nested] 107+ messages in thread
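[Editorial illustration, not part of the original mail.] The timing ideas in the bullets above can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: the function name `earliest_backport_rc` and the default of 4 are hypothetical, the "after -rcN" note is parsed the way the stable tag example in the mail spells it, and the third bullet is simplified to "every commit with only a Fixes: tag waits", ignoring the merge-window and committer-date conditions. It is not how the stable team's scripts work.

```python
import re

# Stable tag with an optional free-form note after '#', for example
# "Cc: <stable@vger.kernel.org> # after -rc4" (form taken from the mail).
STABLE_TAG = re.compile(
    r"^Cc:[ \t]*<?stable@vger\.kernel\.org>?[ \t]*(?:#[ \t]*(?P<note>.*))?$",
    re.IGNORECASE | re.MULTILINE,
)
AFTER_RC = re.compile(r"after\s+-rc(\d+)", re.IGNORECASE)
FIXES_TAG = re.compile(r"^Fixes:\s", re.MULTILINE)


def earliest_backport_rc(commit_message, default_fixes_only_rc=4):
    """Return the -rc number a backport should wait for.

    0 means no delay was requested, None means the commit is not a
    backport candidate at all. Hypothetical helper for illustration.
    """
    stable = STABLE_TAG.search(commit_message)
    if stable:
        note = stable.group("note") or ""
        delay = AFTER_RC.search(note)
        return int(delay.group(1)) if delay else 0
    if FIXES_TAG.search(commit_message):
        # Simplified version of the "non urgent" idea: commits that only
        # carry a Fixes: tag wait for some mainline testing first.
        return default_fixes_only_rc
    return None
```

A commit carrying the explicit stable tag keeps whatever delay its author asked for, so urgent fixes are not held back by the default applied to commits that only have a Fixes: tag.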
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 8:42 ` [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions Thorsten Leemhuis @ 2024-06-13 9:59 ` Jan Kara 2024-06-13 10:18 ` Thorsten Leemhuis ` (2 more replies) 2024-06-13 11:58 ` James Bottomley ` (2 subsequent siblings) 3 siblings, 3 replies; 107+ messages in thread From: Jan Kara @ 2024-06-13 9:59 UTC (permalink / raw) To: Thorsten Leemhuis; +Cc: ksummit On Thu 13-06-24 10:42:01, Thorsten Leemhuis wrote: > * One cause of regressions that happen in stable trees (and not in > mainline) I've seen quite a few times are backports of commits with > Fixes: tags that were part of a patch-series and depend on earlier > patches from the series. The stable-team afaics has no easy way to spot > this, as there is no way to check "was this change part of a series". > Sometimes I wonder if a dedicated tag linking to the submission of a > patch could help -- and is something quite a few maintainers already > really want and add using a "Link" tag despite Linus dislike for that > (IIRC). FWIW I (and a few other maintainers) use 'Message-Id' tag to link to submission. This is still easily convertible to lore link and unlike 'Link' tag it is clear what this tag is about and that it is not just a link to related discussion or something like that. AFAIK this also addresses Linus' dislike because what he was complaining about is that 'Link' should be linking to some useful context for the changelog, not just patch submission. > But following that link for each and every patch slated for > backporting does not scale for the stable team anyway, so it's likely > not worth it. 
Well, what I'd propose is that if 'Message-Id' tag is present and thus it can be established (in an automated way using lore) which series this patch was part of, then stable maintainers will either pick all patches from the start of the series upto this change or nothing. Because what I see happening several times in a year just in subsystems I maintain is that stable tree picks up more or less random subset of a patch series (depending on what applies and what their algorithms decide to take) and that causes issues. Sometimes we catch that during glancing over patches flowing into stable (like Amir did last week) but sometimes we don't and breakage happens. This will require a bit more discipline when creating patch series to put more or less independent fixes that should go into stable first but that is a good practice anyway and mostly followed at least in the areas of the kernel I work in. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 107+ messages in thread
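[Editorial illustration, not part of the original mail.] The "all patches from the start of the series up to this change, or nothing" rule proposed above could look roughly like the following Python sketch. It assumes the series order has already been recovered (for example via the Message-Id tag and lore); the names `series_prefix_for`, `all_or_nothing` and the `applies_cleanly` callback are hypothetical and only serve the illustration.

```python
def series_prefix_for(series, target):
    """Return every patch from the start of `series` up to and including `target`."""
    idx = series.index(target)  # raises ValueError if target is not in the series
    return series[: idx + 1]


def all_or_nothing(series, target, applies_cleanly):
    """Pick the whole prefix leading up to `target`, or nothing at all.

    `applies_cleanly` is a caller-supplied predicate (hypothetical) that
    says whether a single patch can be backported to the stable tree.
    """
    prefix = series_prefix_for(series, target)
    return prefix if all(applies_cleanly(p) for p in prefix) else []
```

With a four-patch series and a Fixes:-tagged third patch, this selects patches one to three together, and selects nothing if any of them cannot be applied, instead of a more or less random subset.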
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 9:59 ` Jan Kara @ 2024-06-13 10:18 ` Thorsten Leemhuis 2024-06-13 14:08 ` Konstantin Ryabitsev 2024-06-13 19:39 ` Dan Carpenter 2 siblings, 0 replies; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-13 10:18 UTC (permalink / raw) To: Jan Kara; +Cc: ksummit On 13.06.24 11:59, Jan Kara wrote: > On Thu 13-06-24 10:42:01, Thorsten Leemhuis wrote: >> * One cause of regressions that happen in stable trees (and not in >> mainline) I've seen quite a few times are backports of commits with >> Fixes: tags that were part of a patch-series and depend on earlier >> patches from the series. The stable-team afaics has no easy way to spot >> this, as there is no way to check "was this change part of a series". >> Sometimes I wonder if a dedicated tag linking to the submission of a >> patch could help -- and is something quite a few maintainers already >> really want and add using a "Link" tag despite Linus dislike for that >> (IIRC). > > FWIW I (and a few other maintainers) use 'Message-Id' tag to link to > submission. This is still easily convertible to lore link and unlike 'Link' > tag it is clear what this tag is about and that it is not just a link to > related discussion or something like that. AFAIK this also addresses Linus' > dislike because what he was complaining about is that 'Link' should be > linking to some useful context for the changelog, not just patch > submission. Well, fine with me. But at the same time this makes me wonder: if we want to establish a new tag, why not just use something that makes it more obvious what the tag is about (e.g. "Review:", "Posting:", "Submission:", or something like that) and then just use a lore link for consistency, as we deal with those already all the time? This also makes it easier to follow later for anyone who looks closer at the commit and wants the bigger picture (e.g. 
cover letter and a list of related patches). >> But following that link for each and every patch slated for >> backporting does not scale for the stable team anyway, so it's likely >> not worth it. > > Well, what I'd propose is that if 'Message-Id' tag is present and thus it > can be established (in an automated way using lore) which series this patch > was part of, then stable maintainers will either pick all patches from the > start of the series upto this change or nothing. +1 (and this obviously could work about the same if it was a proper link) > Because what I see > happening several times in a year just in subsystems I maintain is that > stable tree picks up more or less random subset of a patch series > (depending on what applies and what their algorithms decide to take) and > that causes issues. Sometimes we catch that during glancing over patches > flowing into stable (like Amir did last week) but sometimes we don't and > breakage happens. > > This will require a bit more discipline when creating patch series to put > more or less independent fixes that should go into stable first but that is > a good practice anyway and mostly followed at least in the areas of the > kernel I work in. +1 Ciao, Thorsten ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 9:59 ` Jan Kara 2024-06-13 10:18 ` Thorsten Leemhuis @ 2024-06-13 14:08 ` Konstantin Ryabitsev 2024-06-14 9:19 ` Lee Jones 2024-06-14 14:29 ` Michael Ellerman 2024-06-13 19:39 ` Dan Carpenter 2 siblings, 2 replies; 107+ messages in thread From: Konstantin Ryabitsev @ 2024-06-13 14:08 UTC (permalink / raw) To: Jan Kara; +Cc: Thorsten Leemhuis, ksummit On Thu, Jun 13, 2024 at 11:59:17AM GMT, Jan Kara wrote: > > * One cause of regressions that happen in stable trees (and not in > > mainline) I've seen quite a few times are backports of commits with > > Fixes: tags that were part of a patch-series and depend on earlier > > patches from the series. The stable-team afaics has no easy way to spot > > this, as there is no way to check "was this change part of a series". > > Sometimes I wonder if a dedicated tag linking to the submission of a > > patch could help -- and is something quite a few maintainers already > > really want and add using a "Link" tag despite Linus dislike for that > > (IIRC). > > FWIW I (and a few other maintainers) use 'Message-Id' tag to link to > submission. This is still easily convertible to lore link and unlike 'Link' > tag it is clear what this tag is about and that it is not just a link to > related discussion or something like that. AFAIK this also addresses Linus' > dislike because what he was complaining about is that 'Link' should be > linking to some useful context for the changelog, not just patch > submission. I am strongly in favour of that from the tooling perspective. Linus suggested that we can always trace the original patch submission from the commit by using the patch-id, but that doesn't work reliably. 
I mused on that here: https://lore.kernel.org/git/20240605-hilarious-dramatic-mushroom-7fd941@lemur/ The gist is that we cannot reliably match the patch-id of the original submission from the git commit, because there are multiple ways to generate the same patch, such as changing the diff algorithm (myers vs. minimal vs. histogram), or changing the number of context lines. If the original author generated their patch with --histogram, but we try to find it by generating the same patch using the default myers algorithm, we may not find it. The "Message-Id" trailer is already documented in git: https://www.git-scm.com/docs/git-am#Documentation/git-am.txt---message-id I suggest we move away from the practice of using Link: trailers to indicate the patch provenance to using Message-Id: trailers for the same purpose. This solves multiple problems: 1. disambiguates Link: trailers so they point to relevant online discussions 2. allows tooling like b4, patchwork, etc, to reliably match commits to submissions for the purposes of better code review automation 3. allows stable and similar projects to better track series grouping for commits -K ^ permalink raw reply [flat|nested] 107+ messages in thread
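[Editorial illustration, not part of the original mail.] To show why a Message-Id trailer is convenient for tooling, here is a small Python sketch that pulls the trailer out of a commit message and turns it into a lore link. The helper name `lore_url` is hypothetical, the `https://lore.kernel.org/r/` prefix is an assumption about the redirector to use, and the sample message id in the usage note is made up. Real tools such as b4 may do this differently.

```python
import re
from urllib.parse import quote

# Trailer as added by `git am --message-id`; matched case-insensitively
# because the exact capitalisation can vary.
MESSAGE_ID = re.compile(
    r"^Message-Id:[ \t]*<(?P<id>[^<>\s]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def lore_url(commit_message, base="https://lore.kernel.org/r/"):
    """Return a lore link for the commit's Message-Id trailer, or None."""
    match = MESSAGE_ID.search(commit_message)
    if match is None:
        return None
    # Escape characters such as '/' that may occur in a message id,
    # but keep '@' readable.
    return base + quote(match.group("id"), safe="@")
```

For a commit whose trailer reads `Message-Id: <20240101-example@host.example>`, this yields `https://lore.kernel.org/r/20240101-example@host.example`. Unlike the patch-id approach, it does not depend on the diff algorithm or the number of context lines used by the submitter.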
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 14:08 ` Konstantin Ryabitsev @ 2024-06-14 9:19 ` Lee Jones 2024-06-14 9:24 ` Lee Jones 2024-06-14 12:27 ` Konstantin Ryabitsev 2024-06-14 14:29 ` Michael Ellerman 1 sibling, 2 replies; 107+ messages in thread From: Lee Jones @ 2024-06-14 9:19 UTC (permalink / raw) To: Konstantin Ryabitsev; +Cc: Jan Kara, Thorsten Leemhuis, ksummit On Thu, 13 Jun 2024, Konstantin Ryabitsev wrote: > On Thu, Jun 13, 2024 at 11:59:17AM GMT, Jan Kara wrote: > > > * One cause of regressions that happen in stable trees (and not in > > > mainline) I've seen quite a few times are backports of commits with > > > Fixes: tags that were part of a patch-series and depend on earlier > > > patches from the series. The stable-team afaics has no easy way to spot > > > this, as there is no way to check "was this change part of a series". > > > Sometimes I wonder if a dedicated tag linking to the submission of a > > > patch could help -- and is something quite a few maintainers already > > > really want and add using a "Link" tag despite Linus dislike for that > > > (IIRC). > > > > FWIW I (and a few other maintainers) use 'Message-Id' tag to link to > > submission. This is still easily convertible to lore link and unlike 'Link' > > tag it is clear what this tag is about and that it is not just a link to > > related discussion or something like that. AFAIK this also addresses Linus' > > dislike because what he was complaining about is that 'Link' should be > > linking to some useful context for the changelog, not just patch > > submission. > > I am strongly in favour of that from the tooling perspective. Linus suggested > that we can always trace the original patch submission from the commit by > using the patch-id, but that doesn't work reliably. 
I mused on that here: > > https://lore.kernel.org/git/20240605-hilarious-dramatic-mushroom-7fd941@lemur/ > > The gist is that we cannot reliably match the patch-id of the original > submission from the git commit, because there are multiple ways to generate > the same patch, such as changing the diff algorithm (myers vs. minimal vs. > histogram), or changing the number of context lines. If the original author > generated their patch with --histogram, but we try to find it by generating > the same patch using the default myers algorithm, we may not find it. > > The "Message-Id" trailer is already documented in git: > https://www.git-scm.com/docs/git-am#Documentation/git-am.txt---message-id > > I suggest we move away from the practice of using Link: trailers to indicate > the patch provenance to using Message-Id: trailers for the same purpose. This > solves multiple problems: > > 1. disambiguates Link: trailers so they point to relevant online discussions > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > submissions for the purposes of better code review automation > 3. allows stable and similar projects to better track series grouping for > commits Sounds good to me. So `b4 am -l` should be replaced with `b4 am ?`. -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 9:19 ` Lee Jones @ 2024-06-14 9:24 ` Lee Jones 2024-06-14 12:27 ` Konstantin Ryabitsev 1 sibling, 0 replies; 107+ messages in thread From: Lee Jones @ 2024-06-14 9:24 UTC (permalink / raw) To: Konstantin Ryabitsev; +Cc: Jan Kara, Thorsten Leemhuis, ksummit On Fri, 14 Jun 2024, Lee Jones wrote: > On Thu, 13 Jun 2024, Konstantin Ryabitsev wrote: > > > On Thu, Jun 13, 2024 at 11:59:17AM GMT, Jan Kara wrote: > > > > * One cause of regressions that happen in stable trees (and not in > > > > mainline) I've seen quite a few times are backports of commits with > > > > Fixes: tags that were part of a patch-series and depend on earlier > > > > patches from the series. The stable-team afaics has no easy way to spot > > > > this, as there is no way to check "was this change part of a series". > > > > Sometimes I wonder if a dedicated tag linking to the submission of a > > > > patch could help -- and is something quite a few maintainers already > > > > really want and add using a "Link" tag despite Linus dislike for that > > > > (IIRC). > > > > > > FWIW I (and a few other maintainers) use 'Message-Id' tag to link to > > > submission. This is still easily convertible to lore link and unlike 'Link' > > > tag it is clear what this tag is about and that it is not just a link to > > > related discussion or something like that. AFAIK this also addresses Linus' > > > dislike because what he was complaining about is that 'Link' should be > > > linking to some useful context for the changelog, not just patch > > > submission. > > > > I am strongly in favour of that from the tooling perspective. Linus suggested > > that we can always trace the original patch submission from the commit by > > using the patch-id, but that doesn't work reliably. 
I mused on that here: > > > > https://lore.kernel.org/git/20240605-hilarious-dramatic-mushroom-7fd941@lemur/ > > > > The gist is that we cannot reliably match the patch-id of the original > > submission from the git commit, because there are multiple ways to generate > > the same patch, such as changing the diff algorithm (myers vs. minimal vs. > > histogram), or changing the number of context lines. If the original author > > generated their patch with --histogram, but we try to find it by generating > > the same patch using the default myers algorithm, we may not find it. > > > > The "Message-Id" trailer is already documented in git: > > https://www.git-scm.com/docs/git-am#Documentation/git-am.txt---message-id > > > > I suggest we move away from the practice of using Link: trailers to indicate > > the patch provenance to using Message-Id: trailers for the same purpose. This > > solves multiple problems: > > > > 1. disambiguates Link: trailers so they point to relevant online discussions > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > > submissions for the purposes of better code review automation > > 3. allows stable and similar projects to better track series grouping for > > commits > > Sounds good to me. > > So `b4 am -l` should be replaced with `b4 am ?`.

Actually it looks as though I have an avenue for this already:

  print_green "\nFetching patch(es)"
  b4 am -3 -slt ${PATCHES} -o - ${id} > ${MBOX}

  print_green "\nRunning through checkpatch.pl"
  cat ${MBOX} | formail -ds ./scripts/checkpatch.pl || true

  print_blue "\nCheck the results (hit return to continue or Ctrl+c to exit)"
  read -u 1

  print_green "\nApplying patch(es)"
  cat ${MBOX} | git am -3 --reject --message-id      #### <--- HERE
  if [ ${?} != 0 ]; then
      print_red "\nFailed to apply patches (fix and either hit return to continue or Ctrl+c to exit)"
      read -u 1
  fi

Might be nicer to build it right into `b4` though.
-- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 9:19 ` Lee Jones 2024-06-14 9:24 ` Lee Jones @ 2024-06-14 12:27 ` Konstantin Ryabitsev 2024-06-14 14:26 ` Konstantin Ryabitsev 1 sibling, 1 reply; 107+ messages in thread From: Konstantin Ryabitsev @ 2024-06-14 12:27 UTC (permalink / raw) To: Lee Jones; +Cc: Jan Kara, Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 10:19:49AM GMT, Lee Jones wrote: > > 1. disambiguates Link: trailers so they point to relevant online discussions > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > > submissions for the purposes of better code review automation > > 3. allows stable and similar projects to better track series grouping for > > commits > > Sounds good to me. > > So `b4 am -l` should be replaced with `b4 am ?`. You can still use -l for this by adding this to .gitconfig: [b4] linktrailermask = Message-Id: <%s> -K ^ permalink raw reply [flat|nested] 107+ messages in thread
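To make the effect of that setting concrete, here is a minimal sketch of how a `%s`-style mask such as the one above could be expanded into a trailer line. `expand_trailer_mask` is a hypothetical helper written for illustration, not b4's actual code, and the `Link:` mask in the second call is only an assumed example of a URL-style mask.

```python
def expand_trailer_mask(mask, msgid):
    """Substitute a message-id into a %s-style trailer mask.

    Assumption: the mask contains exactly one %s placeholder, as in the
    .gitconfig example. Angle brackets around the id are dropped first so
    the mask alone decides whether they appear in the result.
    """
    return mask % msgid.strip("<>")


msgid = "<20240529123029.146953-2-mpe@ellerman.id.au>"

# Mask taken from the .gitconfig snippet: yields a Message-Id trailer.
print(expand_trailer_mask("Message-Id: <%s>", msgid))

# Illustrative URL-style mask (an assumption, not a documented default).
print(expand_trailer_mask("Link: https://msgid.link/%s", msgid))
```

Because the trailer name is part of the mask, a change like this is purely a configuration matter: the same message-id can be emitted as either form.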
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 12:27 ` Konstantin Ryabitsev @ 2024-06-14 14:26 ` Konstantin Ryabitsev 2024-06-14 14:36 ` Lee Jones 0 siblings, 1 reply; 107+ messages in thread From: Konstantin Ryabitsev @ 2024-06-14 14:26 UTC (permalink / raw) To: Lee Jones; +Cc: Jan Kara, Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 08:27:30AM GMT, Konstantin Ryabitsev wrote: > On Fri, Jun 14, 2024 at 10:19:49AM GMT, Lee Jones wrote: > > > 1. disambiguates Link: trailers so they point to relevant online discussions > > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > > > submissions for the purposes of better code review automation > > > 3. allows stable and similar projects to better track series grouping for > > > commits > > > > Sounds good to me. > > > > So `b4 am -l` should be replaced with `b4 am ?`. > > You can still use -l for this by adding this to .gitconfig: > > [b4] > linktrailermask = Message-Id: <%s> I also just added the -i flag, which will make it into b4-0.14, so you'll be able to run "b4 am -i" to insert the Message-ID: trailer instead of the Link: trailer. -K ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:26 ` Konstantin Ryabitsev @ 2024-06-14 14:36 ` Lee Jones 0 siblings, 0 replies; 107+ messages in thread From: Lee Jones @ 2024-06-14 14:36 UTC (permalink / raw) To: Konstantin Ryabitsev; +Cc: Jan Kara, Thorsten Leemhuis, ksummit On Fri, 14 Jun 2024, Konstantin Ryabitsev wrote: > On Fri, Jun 14, 2024 at 08:27:30AM GMT, Konstantin Ryabitsev wrote: > > On Fri, Jun 14, 2024 at 10:19:49AM GMT, Lee Jones wrote: > > > > 1. disambiguates Link: trailers so they point to relevant online discussions > > > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > > > > submissions for the purposes of better code review automation > > > > 3. allows stable and similar projects to better track series grouping for > > > > commits > > > > > > Sounds good to me. > > > > > > So `b4 am -l` should be replaced with `b4 am ?`. > > > > You can still use -l for this by adding this to .gitconfig: > > > > [b4] > > linktrailermask = Message-Id: <%s> > > I also just added the -i flag, which will make it into b4-0.14, so you'll be > able to run "b4 am -i" to insert the Message-ID: trailer instead of the Link: > trailer. I might keep both. It'll give Michael something to click on. -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 14:08 ` Konstantin Ryabitsev 2024-06-14 9:19 ` Lee Jones @ 2024-06-14 14:29 ` Michael Ellerman 2024-06-14 14:38 ` Konstantin Ryabitsev ` (3 more replies) 1 sibling, 4 replies; 107+ messages in thread From: Michael Ellerman @ 2024-06-14 14:29 UTC (permalink / raw) To: Konstantin Ryabitsev, Jan Kara; +Cc: Thorsten Leemhuis, ksummit Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes: > On Thu, Jun 13, 2024 at 11:59:17AM GMT, Jan Kara wrote: >> > * One cause of regressions that happen in stable trees (and not in >> > mainline) I've seen quite a few times are backports of commits with >> > Fixes: tags that were part of a patch-series and depend on earlier >> > patches from the series. The stable-team afaics has no easy way to spot >> > this, as there is no way to check "was this change part of a series". >> > Sometimes I wonder if a dedicated tag linking to the submission of a >> > patch could help -- and is something quite a few maintainers already >> > really want and add using a "Link" tag despite Linus dislike for that >> > (IIRC). >> >> FWIW I (and a few other maintainers) use 'Message-Id' tag to link to >> submission. This is still easily convertible to lore link and unlike 'Link' >> tag it is clear what this tag is about and that it is not just a link to >> related discussion or something like that. AFAIK this also addresses Linus' >> dislike because what he was complaining about is that 'Link' should be >> linking to some useful context for the changelog, not just patch >> submission. > > I am strongly in favour of that from the tooling perspective. Linus suggested > that we can always trace the original patch submission from the commit by > using the patch-id, but that doesn't work reliably. 
I mused on that here: > > https://lore.kernel.org/git/20240605-hilarious-dramatic-mushroom-7fd941@lemur/ > > The gist is that we cannot reliably match the patch-id of the original > submission from the git commit, because there are multiple ways to generate > the same patch, such as changing the diff algorithm (myers vs. minimal vs. > histogram), or changing the number of context lines. If the original author > generated their patch with --histogram, but we try to find it by generating > the same patch using the default myers algorithm, we may not find it. > > The "Message-Id" trailer is already documented in git: > https://www.git-scm.com/docs/git-am#Documentation/git-am.txt---message-id > > I suggest we move away from the practice of using Link: trailers to indicate > the patch provenance to using Message-Id: trailers for the same purpose. This > solves multiple problems: > > 1. disambiguates Link: trailers so they point to relevant online discussions > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > submissions for the purposes of better code review automation > 3. allows stable and similar projects to better track series grouping for > commits Message-Id: sucks, I want a link I can open with a single click. At your suggestion I switched to using https://msgid.link/ as the target for patch links, eg: Link: https://msgid.link/20240529123029.146953-2-mpe@ellerman.id.au Which gives the reader a hint that the link is just to the submission. I don't really care if the tag is "Link:", but it has to be a URL, not just a bare message-id that I have to cut and paste like it's the stone age. cheers ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:29 ` Michael Ellerman @ 2024-06-14 14:38 ` Konstantin Ryabitsev 2024-06-14 14:44 ` Rafael J. Wysocki ` (2 more replies) 2024-06-14 14:43 ` Mark Brown ` (2 subsequent siblings) 3 siblings, 3 replies; 107+ messages in thread From: Konstantin Ryabitsev @ 2024-06-14 14:38 UTC (permalink / raw) To: Michael Ellerman; +Cc: Jan Kara, Thorsten Leemhuis, ksummit On Sat, Jun 15, 2024 at 12:29:09AM GMT, Michael Ellerman wrote: > > 1. disambiguates Link: trailers so they point to relevant online discussions > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > > submissions for the purposes of better code review automation > > 3. allows stable and similar projects to better track series grouping for > > commits > > Message-Id: sucks, I want a link I can open with a single click. But why would you want to, on a regular basis? Viewing the series submission has got to provide near zero useful info -- if it was accepted into the tree, then at most it would have a couple of stray code review trailers. > At your suggestion I switched to using https://msgid.link/ as the target > for patch links, eg: > > Link: https://msgid.link/20240529123029.146953-2-mpe@ellerman.id.au This is still my recommendation, but it doesn't stop someone from using msgid.link URLs to link to actual discussions. It also doesn't solve the problem being discussed here -- reliably mapping commits to patch submissions for the purposes of automation. -K ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:38 ` Konstantin Ryabitsev @ 2024-06-14 14:44 ` Rafael J. Wysocki 2024-06-14 15:08 ` Geert Uytterhoeven 2024-06-14 15:45 ` Mark Brown 2 siblings, 0 replies; 107+ messages in thread From: Rafael J. Wysocki @ 2024-06-14 14:44 UTC (permalink / raw) To: Konstantin Ryabitsev Cc: Michael Ellerman, Jan Kara, Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 4:38 PM Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote: > > On Sat, Jun 15, 2024 at 12:29:09AM GMT, Michael Ellerman wrote: > > > 1. disambiguates Link: trailers so they point to relevant online discussions > > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > > > submissions for the purposes of better code review automation > > > 3. allows stable and similar projects to better track series grouping for > > > commits > > > > Message-Id: sucks, I want a link I can open with a single click. > > But why would you want to, on a regular basis? Viewing the series submission > has got to provide near zero useful info -- if it was accepted into the tree, > then at most it would have a couple of stray code review trailers. > > > At your suggestion I switched to using https://msgid.link/ as the target > > for patch links, eg: > > > > Link: https://msgid.link/20240529123029.146953-2-mpe@ellerman.id.au > > This is still my recommendation, but it doesn't stop someone from using > msgid.link URLs to link to actual discussions. It also doesn't solve the > problem being discussed here -- reliably mapping commits to patch submissions > for the purposes of automation. IMO anything short of a new tag wouldn't help. With a new tag, say Submission:, pointing to the original patch at Lore, this could be addressed. Of course, people would need to see a clear benefit to adopt it because it would require some extra work to use. 
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:38 ` Konstantin Ryabitsev 2024-06-14 14:44 ` Rafael J. Wysocki @ 2024-06-14 15:08 ` Geert Uytterhoeven 2024-06-15 11:29 ` Michael Ellerman 2024-06-17 10:15 ` Jani Nikula 2024-06-14 15:45 ` Mark Brown 2 siblings, 2 replies; 107+ messages in thread From: Geert Uytterhoeven @ 2024-06-14 15:08 UTC (permalink / raw) To: Konstantin Ryabitsev Cc: Michael Ellerman, Jan Kara, Thorsten Leemhuis, ksummit Hi Konstantin, On Fri, Jun 14, 2024 at 4:38 PM Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote: > On Sat, Jun 15, 2024 at 12:29:09AM GMT, Michael Ellerman wrote: > > > 1. disambiguates Link: trailers so they point to relevant online discussions > > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > > > submissions for the purposes of better code review automation > > > 3. allows stable and similar projects to better track series grouping for > > > commits > > > > Message-Id: sucks, I want a link I can open with a single click. > > But why would you want to, on a regular basis? Viewing the series submission > has got to provide near zero useful info -- if it was accepted into the tree, > then at most it would have a couple of stray code review trailers. I open these links regularly (as in daily), for various reasons: - Finding the thread to reply to when reporting a bug, - Checking for new Rb-tags given, - As a starting point for reading earlier submissions of the same patch, - ... Lots of tools (gnome-terminal, mate-terminal, gitk, cgit, ...) support opening URLs in commit logs at the blink of an eye. Having just a Message-Id means more work (yes, I have lore configured as a search engine in my browser). 
That's also why I detest people putting patchwork URLs instead of lore URLs in the Link:-tag: finding the thread in lore requires one more click on "mailing list archive" (for patchwork.kernel.org) or a copy-'n-paste of the Message-Id (for oh-the-horror patchwork.freedesktop.org; and what if freedesktop.org goes away?) Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 15:08 ` Geert Uytterhoeven @ 2024-06-15 11:29 ` Michael Ellerman 2024-06-17 10:15 ` Jani Nikula 1 sibling, 0 replies; 107+ messages in thread From: Michael Ellerman @ 2024-06-15 11:29 UTC (permalink / raw) To: Geert Uytterhoeven, Konstantin Ryabitsev Cc: Jan Kara, Thorsten Leemhuis, ksummit Geert Uytterhoeven <geert@linux-m68k.org> writes: > Hi Konstantin, > > On Fri, Jun 14, 2024 at 4:38 PM Konstantin Ryabitsev > <konstantin@linuxfoundation.org> wrote: >> On Sat, Jun 15, 2024 at 12:29:09AM GMT, Michael Ellerman wrote: >> > > 1. disambiguates Link: trailers so they point to relevant online discussions >> > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to >> > > submissions for the purposes of better code review automation >> > > 3. allows stable and similar projects to better track series grouping for >> > > commits >> > >> > Message-Id: sucks, I want a link I can open with a single click. >> >> But why would you want to, on a regular basis? Viewing the series submission >> has got to provide near zero useful info -- if it was accepted into the tree, >> then at most it would have a couple of stray code review trailers. > > I open these links regularly (as in daily), for various reasons: > - Finding the thread to reply to when reporting a bug, > - Checking for new Rb-tags given, > - As a starting point for reading earlier submissions of the > same patch, > - ... Yep, all those. I also use them to check that the version of a patch I have committed locally in my testing tree is still the latest submission, before pushing. Because the time between applying a patch and pushing it can be days if it needs lots of testing. Another case from yesterday was a ~3 year old patch, someone was asking about backporting. 
I opened the link to the submission and saw that someone had replied to it reporting a bug (that I had forgotten about). cheers ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 15:08 ` Geert Uytterhoeven 2024-06-15 11:29 ` Michael Ellerman @ 2024-06-17 10:15 ` Jani Nikula 2024-06-17 12:42 ` Geert Uytterhoeven 1 sibling, 1 reply; 107+ messages in thread From: Jani Nikula @ 2024-06-17 10:15 UTC (permalink / raw) To: Geert Uytterhoeven, Konstantin Ryabitsev Cc: Michael Ellerman, Jan Kara, Thorsten Leemhuis, ksummit On Fri, 14 Jun 2024, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > That's also why I detest people putting patchwork URLs instead of > lore URLs in the Link:-tag: finding the thread in lore requires one > more click on "mailing list archive" (for patchwork.kernel.org) > or a copy-'n-paste of the Message-Id (for oh-the-horror > patchwork.freedesktop.org; and what if freedesktop.org goes away?) More than 99% of the Link: tags pointing at patchwork.freedesktop.org have Message-ID in the URL. BR, Jani. -- Jani Nikula, Intel ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-17 10:15 ` Jani Nikula @ 2024-06-17 12:42 ` Geert Uytterhoeven 0 siblings, 0 replies; 107+ messages in thread From: Geert Uytterhoeven @ 2024-06-17 12:42 UTC (permalink / raw) To: Jani Nikula Cc: Konstantin Ryabitsev, Michael Ellerman, Jan Kara, Thorsten Leemhuis, ksummit Hi Jani, On Mon, Jun 17, 2024 at 12:15 PM Jani Nikula <jani.nikula@intel.com> wrote: > On Fri, 14 Jun 2024, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > That's also why I detest people putting patchwork URLs instead of > > lore URLs in the Link:-tag: finding the thread in lore requires one > > more click on "mailing list archive" (for patchwork.kernel.org) > > or a copy-'n-paste of the Message-Id (for oh-the-horror > > patchwork.freedesktop.org; and what if freedesktop.org goes away?) > > More than 99% of the Link: tags pointing at patchwork.freedesktop.org > have Message-ID in the URL. Sure, I can extract the Message-ID from either the URL or the patchwork web page. But it requires more work from my[*] side (compared to commits from all^Wmost other subsystems). [*] Optimize for the (many) readers, not for the (single) writer. Thanks! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:38 ` Konstantin Ryabitsev 2024-06-14 14:44 ` Rafael J. Wysocki 2024-06-14 15:08 ` Geert Uytterhoeven @ 2024-06-14 15:45 ` Mark Brown 2 siblings, 0 replies; 107+ messages in thread From: Mark Brown @ 2024-06-14 15:45 UTC (permalink / raw) To: Konstantin Ryabitsev Cc: Michael Ellerman, Jan Kara, Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 10:38:12AM -0400, Konstantin Ryabitsev wrote: > On Sat, Jun 15, 2024 at 12:29:09AM GMT, Michael Ellerman wrote: > > Message-Id: sucks, I want a link I can open with a single click. > But why would you want to, on a regular basis? Viewing the series submission > has got to provide near zero useful info -- if it was accepted into the tree, > then at most it would have a couple of stray code review trailers. The two scenarios I run into really often are when doing archeology to try to figure out if there's any extra context for a patch (eg, a wider series it was embedded in) or when some problem has been found in CI and I want to figure out who to tell about it or if anyone else saw similar issues. ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:29 ` Michael Ellerman 2024-06-14 14:38 ` Konstantin Ryabitsev @ 2024-06-14 14:43 ` Mark Brown 2024-06-14 14:51 ` Konstantin Ryabitsev 2024-06-14 14:43 ` Steven Rostedt 2024-06-16 1:13 ` Linus Torvalds 3 siblings, 1 reply; 107+ messages in thread From: Mark Brown @ 2024-06-14 14:43 UTC (permalink / raw) To: Michael Ellerman Cc: Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit On Sat, Jun 15, 2024 at 12:29:09AM +1000, Michael Ellerman wrote: > Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes: > > I suggest we move away from the practice of using Link: trailers to indicate > > the patch provenance to using Message-Id: trailers for the same purpose. This > > solves multiple problems: > > 1. disambiguates Link: trailers so they point to relevant online discussions > > 2. allows tooling like b4, patchwork, etc, to reliably match commits to > > submissions for the purposes of better code review automation > > 3. allows stable and similar projects to better track series grouping for > > commits > Message-Id: sucks, I want a link I can open with a single click. > At your suggestion I switched to using https://msgid.link/ as the target > for patch links, eg: > Link: https://msgid.link/20240529123029.146953-2-mpe@ellerman.id.au > Which gives the reader a hint that the link is just to the submission. > I don't really care if the tag is "Link:", but it has to be a URL, not > just a bare message-id that I have to cut and paste like it's the stone > age. Actually now that you mention it some terminals (GNOME I think?) have a feature where they'll identify strings with an @ in them as e-mail addresses and if you click on one they'll try to fire up some GUI mail client with a new e-mail addressed to that. This interacts poorly with using message IDs a lot. 
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:43 ` Mark Brown @ 2024-06-14 14:51 ` Konstantin Ryabitsev 2024-06-14 15:42 ` Mark Brown 0 siblings, 1 reply; 107+ messages in thread From: Konstantin Ryabitsev @ 2024-06-14 14:51 UTC (permalink / raw) To: Mark Brown; +Cc: Michael Ellerman, Jan Kara, Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 03:43:18PM GMT, Mark Brown wrote: > > Message-Id: sucks, I want a link I can open with a single click. > > > At your suggestion I switched to using https://msgid.link/ as the target > > for patch links, eg: > > > Link: https://msgid.link/20240529123029.146953-2-mpe@ellerman.id.au > > > Which gives the reader a hint that the link is just to the submission. > > > I don't really care if the tag is "Link:", but it has to be a URL, not > > just a bare message-id that I have to cut and paste like it's the stone > > age. > > Actually now that you mention it some terminals (GNOME I think?) have a > feature where they'll identify strings with an @ in them as e-mail > addresses and if you click on one they'll try to fire up some GUI mail > client with a new e-mail addressed to that. This interacts poorly with > using message IDs a lot. Yeah, but same would happen if you accidentally click on anyone's email address in the trailers, so I'm not sure how this has any different impact? -K ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:51 ` Konstantin Ryabitsev @ 2024-06-14 15:42 ` Mark Brown 0 siblings, 0 replies; 107+ messages in thread From: Mark Brown @ 2024-06-14 15:42 UTC (permalink / raw) To: Konstantin Ryabitsev Cc: Michael Ellerman, Jan Kara, Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 10:51:53AM -0400, Konstantin Ryabitsev wrote: > On Fri, Jun 14, 2024 at 03:43:18PM GMT, Mark Brown wrote: > > Actually now that you mention it some terminals (GNOME I think?) have a > > feature where they'll identify strings with an @ in them as e-mail > > addresses and if you click on one they'll try to fire up some GUI mail > > client with a new e-mail addressed to that. This interacts poorly with > > using message IDs a lot. > Yeah, but same would happen if you accidentally click on anyone's email > address in the trailers, so I'm not sure how this has any different impact? The suggested workflow was to go cut'n'pasting message IDs when using them, it's a lot more likely that you'll click on something you're actively trying to cut'n'paste individually than a random part of the trailers. ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:29 ` Michael Ellerman 2024-06-14 14:38 ` Konstantin Ryabitsev 2024-06-14 14:43 ` Mark Brown @ 2024-06-14 14:43 ` Steven Rostedt 2024-06-14 14:57 ` Laurent Pinchart 2024-06-16 1:13 ` Linus Torvalds 3 siblings, 1 reply; 107+ messages in thread From: Steven Rostedt @ 2024-06-14 14:43 UTC (permalink / raw) To: Michael Ellerman Cc: Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit On Sat, 15 Jun 2024 00:29:09 +1000 Michael Ellerman <mpe@ellerman.id.au> wrote: > Message-Id: sucks, I want a link I can open with a single click. > > At your suggestion I switched to using https://msgid.link/ as the target > for patch links, eg: > > Link: https://msgid.link/20240529123029.146953-2-mpe@ellerman.id.au > > Which gives the reader a hint that the link is just to the submission. > > I don't really care if the tag is "Link:", but it has to be a URL, not > just a bare message-id that I have to cut and paste like it's the stone > age. I just switched my scripts over to Message-Id: and applied it, and after playing with it a little, I agree with the above sentiment. I like having a link to the actual patch that I can just click on. The message-id adds more steps to get there. I'm going back to the link, but I agree with others, "Link:" should be for the discussions. Perhaps we could use "Pulled-from:" ? -- Steve ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-14 14:43 ` Steven Rostedt
@ 2024-06-14 14:57 ` Laurent Pinchart
0 siblings, 0 replies; 107+ messages in thread
From: Laurent Pinchart @ 2024-06-14 14:57 UTC (permalink / raw)
To: Steven Rostedt
Cc: Michael Ellerman, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

On Fri, Jun 14, 2024 at 10:43:51AM -0400, Steven Rostedt wrote:
> On Sat, 15 Jun 2024 00:29:09 +1000 Michael Ellerman wrote:
>
> > Message-Id: sucks, I want a link I can open with a single click.
> >
> > At your suggestion I switched to using https://msgid.link/ as the target
> > for patch links, eg:
> >
> > Link: https://msgid.link/20240529123029.146953-2-mpe@ellerman.id.au
> >
> > Which gives the reader a hint that the link is just to the submission.
> >
> > I don't really care if the tag is "Link:", but it has to be a URL, not
> > just a bare message-id that I have to cut and paste like it's the stone
> > age.
>
> I just switched my scripts over to Message-Id: and applied it, and after
> playing with it a little, I agree with the above sentiment. I like
> having a link to the actual patch that I can just click on. The
> message-id adds more steps to get there.

I'm sure someone could easily come up with a script that parses the
Message-Id trailer and opens lore.kernel.org in a web browser. You can
then bind that to a key in mutt, and won't have to even click on a
link :-)

Jokes aside, I think trailers should be designed first and foremost to
provide the data that is needed to solve the problems at hand. How to
format that data (link vs. msg-id) for human consumption is secondary.

> I'm going back to the link, but I agree with others, "Link:" should be
> for the discussions. Perhaps we could use "Pulled-from:" ?

--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 107+ messages in thread
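The script Laurent alludes to could look roughly like the sketch below. It is only an illustration, not anything posted in the thread: the function name `message_id_urls` is made up, and it assumes that `https://lore.kernel.org/<message-id>` resolves for publicly archived messages, as the URL examples elsewhere in this thread suggest.

```python
import re
from typing import List
from urllib.parse import quote

# Assumption: lore.kernel.org accepts a bare message-id as the URL path.
LORE_BASE = "https://lore.kernel.org/"

# Matches "Message-Id: <id>" or "Message-ID: id" trailers, one per line.
TRAILER_RE = re.compile(r"^Message-I[dD]:\s*<?([^<>\s]+)>?\s*$", re.MULTILINE)


def message_id_urls(commit_message: str) -> List[str]:
    """Return one lore URL per Message-Id trailer found in the text."""
    # Keep '@' readable; percent-encode anything else that is unsafe in
    # a URL path (for example a '/' inside the message-id).
    return [LORE_BASE + quote(mid, safe="@")
            for mid in TRAILER_RE.findall(commit_message)]
```

Passing each returned URL to the standard library's `webbrowser.open()` would cover the "opens lore.kernel.org in a web browser" part; binding that to a key in a mail client is left to the reader.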
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-14 14:29 ` Michael Ellerman
` (2 preceding siblings ...)
2024-06-14 14:43 ` Steven Rostedt
@ 2024-06-16 1:13 ` Linus Torvalds
2024-06-16 3:28 ` Steven Rostedt
` (3 more replies)
3 siblings, 4 replies; 107+ messages in thread
From: Linus Torvalds @ 2024-06-16 1:13 UTC (permalink / raw)
To: Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai
Cc: Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

On Fri, 14 Jun 2024 at 07:29, Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Message-Id: sucks, I want a link I can open with a single click.

100% agreed.

There is no way in hell I will endorse adding more of those completely
*idiotic* "Message-ID" things.

Yes, people use them. It's a damn shame.

There is no excuse for that completely broken model. It's objectively
and unquestionably worse than having a "link".

Here's the thing: if that message-ID isn't public, then that line
SHOULD NOT EXIST and is an actual real problem. I personally look at
those, and go "is that actually available on lore?"

And if that message-id _is_ public, then it has a link, and it's much
easier for people to check.

Ergo: there is absolutely zero reason to ever use Message-ID.

People need to stop advocating that sh*t.

And no, I'm not at all happy with the fact that apparently vhost and
kvm have made it their thing.

Paolo, Michael, Takashi, please put useful links, not those braindead
message IDs, in your commit messages.

Linus
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 1:13 ` Linus Torvalds
@ 2024-06-16 3:28 ` Steven Rostedt
2024-06-16 4:59 ` Linus Torvalds
2024-06-16 7:26 ` Takashi Iwai
` (2 subsequent siblings)
3 siblings, 1 reply; 107+ messages in thread
From: Steven Rostedt @ 2024-06-16 3:28 UTC (permalink / raw)
To: Linus Torvalds
Cc: Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

On Sat, 15 Jun 2024 18:13:57 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Fri, 14 Jun 2024 at 07:29, Michael Ellerman <mpe@ellerman.id.au> wrote:
> >
> > Message-Id: sucks, I want a link I can open with a single click.
>
> 100% agreed.
>
> There is no way in hell I will endorse adding more of those completely
> *idiotic* "Message-ID" things.
>
> Yes, people use them. It's a damn shame.
>
> There is no excuse for that completely broken model. It's objectively
> and unquestionably worse than having a "link".
>
> Here's the thing: if that message-ID isn't public, then that line
> SHOULD NOT EXIST and is an actual real problem. I personally look at
> those, and go "is that actually available on lore?"
>
> And if that message-id _is_ public, then it has a link, and it's much
> easier for people to check.
>
> Ergo: there is absolutely zero reason to ever use Message-ID.
>
> People need to stop advocating that sh*t.
>

After trying it for a brief period, I quickly came to the same
conclusion. I didn't like it because right after implementing it, I
needed to get back to the conversation and found I could no longer
simply click on a link, and I abandoned the "Message-Id" idea.

But I really like having a link to the patch I pulled, even if there
was no conversation about it. I use it for finding previous versions
and so on, which is useful for me.

Now, one day you looked at one of my "Link:" tags and were
disappointed that it didn't have a discussion behind it. I would like
to differentiate links that have a discussion from those that just
say "I pulled it from here". I don't like that I use "Link:" for both.

I prefer that "Link:" goes back to a discussion, but I would like a
separate tag for where the patch came from. What would you suggest?

1) Just keep using Links, and we can figure it out when we click on it.

2) Giving it a separate name:
   a) "Pulled-from:"
   b) "Submission:"
   c) Something else ?

-- Steve
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 3:28 ` Steven Rostedt
@ 2024-06-16 4:59 ` Linus Torvalds
2024-06-16 8:22 ` Paolo Bonzini
` (4 more replies)
0 siblings, 5 replies; 107+ messages in thread
From: Linus Torvalds @ 2024-06-16 4:59 UTC (permalink / raw)
To: Steven Rostedt
Cc: Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

On Sat, 15 Jun 2024 at 20:28, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I prefer that "Link:" goes back to a discussion, but I would like a
> separate tag for where the patch came from. What would you suggest?

I really don't see the advantage of a separate tag name, and I see
actual and immediate disadvantages.

The thing is, I want commit messages to be links, because I do *NOT*
want people to be in the situation where they ask themselves "how do I
look this ref up"?

Yes, a message-ID is often easy to just plug into lore. But as you -
and others - already noted, making it a link means that you don't have
to "plug it into X" at all, and it just works in many different
contexts.

And lore does not index all email.

Which gets me to the other reason I want a link, and why I want to
*name* it "Link".

Because when I say "link", I very much mean exactly that. It's not a
URI, the key part really is that "L" for Link. It needs to actively
point to something on the internet. It's not some random uniform
resource identifier, it's an honest-to-goodness "this is the actual
link to the information".

So I don't want things that point to something on your company
intranet. Nor do I want identifiers to something in your mailbox. It's
*not* supposed to be a message ID, exactly because to be meaningful,
it has to point to *public* data, and it has to be a real link, and
the tag name should make that clear.

And that's why the name "Link:" is important too. Because part of this
is very much a social contract: we are working on open source, and the
keyword here is open. Using a "Link" name kind of mentally enforces
that social contract.

Yes, yes, others use git for their own nefarious reasons, and if you
are working inside a company on some closed source thing, by *all*
means have tags like "Closes-bug: 54321". But that's not what the
kernel is.

We have years of people wanting to add their own meaningless "bug ID"
crap. Or other internal useless markers. And that is *explicitly* what
I don't want, and why I want it to be completely obvious and very very
explicit that the only thing that is valid is a real public link.

And finally - if you applied the patch by just following a message ID
with basically "b4" from lore, I think the source link is almost
entirely worthless.

Here's the thing: if you applied it unchanged from lore, you already
have the email address and a date in the commit.

Are you seriously saying that you can't find it based on that?

Now, if you *base* your commit on somebody else's work on the lists,
you should most definitely say that, and say something like

    Based on patch submission by Xyz at [1]

    Link: https://lore.kernel.org/...../ [1]

and that's _wonderful_.

But if you just did "b4 am" and applied a patch, what's the advantage
of including information that adds no real value?

So a pure source link I still find to be of *very* questionable value,
compared to things that have actual obvious real value:

 - bug reports

 - background discussion

 - pointers to earlier versions that didn't get committed

so yes, I find it almost offensive when I have to debug a problem, and
I find a Link: that I hope explains things, and all it just shows is
the SAME DAMN INFORMATION that was in the commit already, and that I
could trivially have found by just searching lore for the author and
date.

At that point, "Link:" is just wasting my time.

And I'm not making up that "search lore for author and date range"
thing. That's EXACTLY what I do. Not to find the original submission,
but to find the discussion about things. Sometimes years prior.

A few days ago, I literally did exactly that to find some background
for a commit from 2011.

Btw, as a related issue, this is why I also despise the syzkaller
convention of hiding magic noise in other tags too, like

    Reported-by: syzbot+6a038377f0a594d7d44e@syzkaller.appspotmail.com

because that's exactly the kind of "ok, how the f*%^ do I look this
up" kind of noise.

And yes, we have exactly that kind of noise in the kernel logs, and
it's wrong. I didn't make that one up.

Now, often - but sadly not at all always - we also end up having an
actual link, eg

    Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe

so that actually says "Oh, look, _that_ is how I look up the noise".
But I'd much rather just see "Link" over "Closes", and generally to
the actual report on lore, if there was any discussion about it.
Because from a kernel standpoint, if something causes problems, the
fact that it _closed_ a bug report is not what is important, is it?
No, the reason you want to look at that link is because the fix caused
problems, and you want the background on it and the original report.

So again, "Closes" is wrong. Why? Same damn reason: make it really
really obvious that what we want is a *LINK*. Not a "syzbot ID".
That's wrong for exactly the same reason "Message-ID:" is wrong.

TL;DR:

 - if you add a "Link:" there should be some *value* to the link, over
   and above "I can find this on lore by just searching for it".

 - there are seldom any real reasons to use anything but "Link:", and
   we have absolutely years of people arguing for their own internal
   bug-IDs that argue *against* making it very very clear that it should
   be a valid link

Thus endeth my rant.

Linus
^ permalink raw reply [flat|nested] 107+ messages in thread
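The "how do I look this up" step for the syzbot tags can be sketched in a few lines. This is a hedged illustration only: the helper name `syzbot_link` is hypothetical, and the URL shape is inferred solely from the Reported-by/Closes pair quoted in the message above, not from any syzbot documentation.

```python
import re
from typing import Optional

# The hex id embedded in the syzbot reporter address.
SYZBOT_RE = re.compile(r"syzbot\+([0-9a-f]+)@syzkaller\.appspotmail\.com")


def syzbot_link(trailer: str) -> Optional[str]:
    """Turn a syzbot Reported-by trailer into a clickable URL, if possible."""
    match = SYZBOT_RE.search(trailer)
    if match is None:
        return None
    # Assumption: the id in the address is the "extid" used by the web UI,
    # as in the Closes: example quoted above.
    return "https://syzkaller.appspot.com/bug?extid=" + match.group(1)
```

Given the second Reported-by trailer from the message, the function reproduces the Closes: URL shown next to it; for any other reporter address it returns None.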
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 4:59 ` Linus Torvalds
@ 2024-06-16 8:22 ` Paolo Bonzini
2024-06-16 9:05 ` Geert Uytterhoeven
` (3 subsequent siblings)
4 siblings, 0 replies; 107+ messages in thread
From: Paolo Bonzini @ 2024-06-16 8:22 UTC (permalink / raw)
To: Linus Torvalds, Steven Rostedt
Cc: Michael Ellerman, Michael S. Tsirkin, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

On 6/16/24 06:59, Linus Torvalds wrote:
> Here's the thing: if you applied it unchanged from lore, you already
> have the email address and a date in the commit.
>
> Are you seriously saying that you can't find it based on that?

Sure I _can_ but it's not especially handy (bordering on self-inflicted
pain, tbh). Having the Message-ID or the URL makes it a lot easier to
find the message in your mail program and reply to it if needed.

I also used to have a script that tagged as "merged" any messages in my
inbox for which the corresponding Message-ID trailer appeared in
linux.git, but it broke at some point and I never fixed it...

(In fact, that is IMO the main point in favor of Message-ID - use it
for the plain message you're applying, so that it can be used to reply
and for tracking purposes; use Link for the discussion that _prompted_
the patch to be created. But I'm not going to argue too much about it).

Paolo

> Now, if you *base* your commit on somebody else's work on the lists,
> you should most definitely say that, and say something like
>
>     Based on patch submission by Xyz at [1]
>
>     Link: https://lore.kernel.org/...../ [1]
>
> and that's _wonderful_.
>
> But if you just did "b4 am" and applied a patch, what's the advantage
> of including information that adds no real value?
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 4:59 ` Linus Torvalds
2024-06-16 8:22 ` Paolo Bonzini
@ 2024-06-16 9:05 ` Geert Uytterhoeven
2024-06-16 15:07 ` Steven Rostedt
` (2 subsequent siblings)
4 siblings, 0 replies; 107+ messages in thread
From: Geert Uytterhoeven @ 2024-06-16 9:05 UTC (permalink / raw)
To: Linus Torvalds
Cc: Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

Hi Linus,

On Sun, Jun 16, 2024 at 7:00 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
> So a pure source link I still find to be of *very* questionable value,
> compared to things that have actual obvious real value:
>
> - bug reports
>
> - background discussion
>
> - pointers to earlier versions that didn't get committed
>
> so yes, I find it almost offensive when I have to debug a problem, and
> I find a Link: that I hope explains things, and all it just shows is
> the SAME DAMN INFORMATION that was in the commit already, and that I
> could trivially have found by just searching lore for the author and
> date.
>
> At that point, "Link:" is just wasting my time.

Searching lore for the author and date manually would have wasted more
time ;-)

The nice thing about the Link to the original patch is that it also
serves as a link to information that did not exist yet at the time the
patch was committed. I regularly run into an issue, bisect the problem,
follow the link, to find a similar bug report and sometimes even a fix.

I agree the perfect patch series would have in the cover letter:
  1. A link to a discussion that started the development of the series,
     and/or earlier tries to solve the itch,
  2. A link to the previously submitted version.
All of these can be found by recursively following the Link in the
commit.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 4:59 ` Linus Torvalds
2024-06-16 8:22 ` Paolo Bonzini
2024-06-16 9:05 ` Geert Uytterhoeven
@ 2024-06-16 15:07 ` Steven Rostedt
2024-06-17 13:48 ` Dan Carpenter
2024-06-17 14:39 ` Konstantin Ryabitsev
4 siblings, 0 replies; 107+ messages in thread
From: Steven Rostedt @ 2024-06-16 15:07 UTC (permalink / raw)
To: Linus Torvalds
Cc: Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

On Sat, 15 Jun 2024 21:59:40 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Yes, a message-ID is often easy to just plug into lore. But as you -
> and others - already noted, making it a link means that you don't have
> to "plug it into X" at all, and it just works in many different
> contexts.
>
> And lore does not index all email.
>

[..]

> And finally - if you applied the patch by just following a message ID
> with basically "b4" from lore, I think the source link is almost
> entirely worthless.
>
> Here's the thing: if you applied it unchanged from lore, you already
> have the email address and a date in the commit.
>
> Are you seriously saying that you can't find it based on that?

As you stated above, I can find it, but it takes a bit of work to do
so. Even more than a Message-Id.

>
> Now, if you *base* your commit on somebody else's work on the lists,
> you should most definitely say that, and say something like
>
>     Based on patch submission by Xyz at [1]
>
>     Link: https://lore.kernel.org/...../ [1]
>
> and that's _wonderful_.
>
> But if you just did "b4 am" and applied a patch, what's the advantage
> of including information that adds no real value?

Because it's an all or nothing approach. I'm not going to bother saying
"oh, there's a discussion with this patch, I'll add the link". Either I
add a link or I don't. I have this automated to just make the link for
the current patch. I also try to use the mailing list it came from. Is
it to linux-trace-devel, or just lkml? That part tells me what the
focus of the commit was.

Now, there have been several times I wanted to know the back story on
some change that was done 5 years ago. But the discussion may be on one
of a few commits to a part of code. It's nice to quickly pick the link
and see where the discussion happened. It may take 5 commits to find
the discussion, as one commit could lead to other commits.

Yes, yes, I like to make verbose change logs where I don't need to see
that discussion, but unfortunately, you can't always predict what part
of a discussion would be relevant in five years and put that into the
change log.

>
> So a pure source link I still find to be of *very* questionable value,
> compared to things that have actual obvious real value:
>
> - bug reports
>
> - background discussion

Which usually point to a source link. Especially if it is split between
several versions. One thing I'm very diligent on, and I'm starting to
see others do the same, is to have in every cover letter:

  Changes since v5: https://lore.kernel.org/all/20240612021642.941740855@goodmis.org/

Where once you find the link, you can easily go back and see the
discussions from the previous link.

>
> - pointers to earlier versions that didn't get committed

As mentioned above, I make sure all versions point to the previous
versions, at least in the cover letters. This way you get to see all
discussions on the change of code.

>
> so yes, I find it almost offensive when I have to debug a problem, and
> I find a Link: that I hope explains things, and all it just shows is
> the SAME DAMN INFORMATION that was in the commit already, and that I
> could trivially have found by just searching lore for the author and
> date.
>
> At that point, "Link:" is just wasting my time.
>
> And I'm not making up that "search lore for author and date range"
> thing. That's EXACTLY what I do. Not to find the original submission,
> but to find the discussion about things. Sometimes years prior.
>
> A few days ago, I literally did exactly that to find some background
> for a commit from 2011.
>
> Btw, as a related issue, this is why I also despise the syzkaller
> convention of hiding magic noise in other tags too, like
>
>     Reported-by: syzbot+6a038377f0a594d7d44e@syzkaller.appspotmail.com
>
> because that's exactly the kind of "ok, how the f*%^ do I look this
> up" kind of noise.
>
> And yes, we have exactly that kind of noise in the kernel logs, and
> it's wrong. I didn't make that one up.
>
> Now, often - but sadly not at all always - we also end up having an
> actual link, eg
>
>     Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com
>     Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
>
> so that actually says "Oh, look, _that_ is how I look up the noise".
> But I'd much rather just see "Link" over "Closes", and generally to
> the actual report on lore, if there was any discussion about it.
> Because from a kernel standpoint, if something causes problems, the
> fact that it _closed_ a bug report is not what is important, is it?
> No, the reason you want to look at that link is because the fix caused
> problems, and you want the background on it and the original report.
>
> So again, "Closes" is wrong. Why? Same damn reason: make it really
> really obvious that what we want is a *LINK*. Not a "syzbot ID".
> That's wrong for exactly the same reason "Message-ID:" is wrong.

I can see them using Closes as a way to show that a fix for a bug in a
database is in mainline. Some tags are for computers to read and not
humans. But I also agree that there should be tags for humans as well.
This is the same discussion I had with Greg about "Depends-on".

>
> TL;DR:

Usually that goes before the too long part ;-)

-- Steve

>
> - if you add a "Link:" there should be some *value* to the link, over
> and above "I can find this on lore by just searching for it".
>
> - there are seldom any real reasons to use anything but "Link:", and
> we have absolutely years of people arguing for their own internal
> bug-IDs that argue *against* making it very very clear that it should
> be a valid link
>
> Thus endeth my rant.
>
> Linus
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 4:59 ` Linus Torvalds
` (2 preceding siblings ...)
2024-06-16 15:07 ` Steven Rostedt
@ 2024-06-17 13:48 ` Dan Carpenter
2024-06-17 15:23 ` Dan Carpenter
2024-06-17 14:39 ` Konstantin Ryabitsev
4 siblings, 1 reply; 107+ messages in thread
From: Dan Carpenter @ 2024-06-17 13:48 UTC (permalink / raw)
To: Linus Torvalds
Cc: Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

To me, the explicit links to the lore thread are useful because when
I'm reporting static checker bugs, I can reply to the thread and CC all
the correct people.

I guess if we have multiple Link: tags then probably the last one is
going to be the link to the thread. So that could work...

regards,
dan carpenter
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-17 13:48 ` Dan Carpenter
@ 2024-06-17 15:23 ` Dan Carpenter
0 siblings, 0 replies; 107+ messages in thread
From: Dan Carpenter @ 2024-06-17 15:23 UTC (permalink / raw)
To: Linus Torvalds
Cc: Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit

On Mon, Jun 17, 2024 at 04:48:03PM +0300, Dan Carpenter wrote:
> I guess if we have multiple Link: tags then probably the last one is
> going to be the link to the thread. So that could work...

Hm... It doesn't really work. :/ The very first email I looked at had
one Link tag which pointed to the v1 version of the patch, and we
applied the v3 version. I guess the Link tag was intended to have been
below the --- cut-off line in this case.

The email was from two months ago. If I searched by the function name,
then it was result 44. If I searched by the author, it was result 113.
Searching through lore is not fun at all.

regards,
dan carpenter
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 4:59 ` Linus Torvalds
` (3 preceding siblings ...)
2024-06-17 13:48 ` Dan Carpenter
@ 2024-06-17 14:39 ` Konstantin Ryabitsev
2024-06-17 16:04 ` Paul E. McKenney
2024-06-18 12:05 ` Michael Ellerman
4 siblings, 2 replies; 107+ messages in thread
From: Konstantin Ryabitsev @ 2024-06-17 14:39 UTC (permalink / raw)
To: Linus Torvalds
Cc: Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Jan Kara, Thorsten Leemhuis, ksummit

On Sat, Jun 15, 2024 at 09:59:40PM GMT, Linus Torvalds wrote:
> And finally - if you applied the patch by just following a message ID
> with basically "b4" from lore, I think the source link is almost
> entirely worthless.

I have to continue to disagree here. I need something *reliable* for
automation to work. If automation fails even 10% of the time, it
generates confusion. I have lots of reports from people where b4 was
able to match 9 out of 10 commits because the author changed something
minor in patch 8/10, and so the patch-id no longer matched. As a
result, committers follow up with "why didn't you apply 8/10" and the
maintainers then have to reply with "oh, I did, but b4 got confused."

Message-IDs are the perfect solution to this problem -- they are a
reliable mechanism to match a commit to the patch where it came from. I
don't care if they are part of the Link: trailer, but I do care to know
*which one* of the Link: trailers points at the original submission. If
there are multiple Link: trailers pointing at lore, one for the patch
submission, and another for a series dependency, discussion, or an
alternative implementation of the same thing, then I no longer have a
reliable course of action.

> Here's the thing: if you applied it unchanged from lore, you already
> have the email address and a date in the commit.
>
> Are you seriously saying that you can't find it based on that?

There are situations where this is unreliable for automation:

- the patch has the "From:" header inside the body that is different
  from the "From:" message header (this is why this would fail most
  commonly)
- the patch has a "Date:" or "Subject:" header inside the body that
  overrides the "Date:" or "Subject:" header in the message
- the author sends the series to a test list
- the author sends the series for a pre-review to the newbies list
  ("hey, can someone quickly confirm that this looks good?")
- the author sends the series to the wrong list, and then corrects
  themselves and sends it to the correct list
- the author sends the same patch as part of multiple series, in the
  hopes that one of them gets through

All of these cases would cause automation to fail.

I understand the reasons why everyone hates having the "Message-ID:"
trailer, and this is fine. Can I counter-propose that we have a unique
URL for links specifically going to patch submissions from which the
commits were made? I've already been recommending using the
"msgid.link" domain, but I'll go a bit further and put forward the
recommendation that:

- commits MAY have Link: trailers indicating the exact origin of the
  patch. To distinguish these links from other Link: trailers that may
  lead to relevant online discussions, they should either use the
  "patch.msgid.link" domain, or indicate the nature of the link using
  the hash-notation. Examples:

  - Link: https://patch.msgid.link/message@id-here
  - Link: https://lore.kernel.org/message@id-here # patch

This would satisfy both the need for automation to have a reliable way
to find the origin of the commit, and clearly indicate the nature of
the link for humans doing commit spelunking.

-K
^ permalink raw reply [flat|nested] 107+ messages in thread
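How tooling might consume the two proposed notations can be sketched as follows. This is a minimal illustration under stated assumptions, not b4's actual logic: the function name `patch_origin_link` is hypothetical, and the only inputs it recognises are the two forms given as examples in the proposal above (the "patch.msgid.link" domain, or a trailing "# patch" note).

```python
import re
from typing import Optional

# A Link: trailer, optionally followed by a "# note" on the same line.
LINK_RE = re.compile(r"^Link:[ \t]*(\S+)(?:[ \t]+#[ \t]*(\S+))?", re.MULTILINE)

# Assumption: this prefix is what marks a patch-origin link by domain.
PATCH_DOMAIN = "https://patch.msgid.link/"


def patch_origin_link(commit_message: str) -> Optional[str]:
    """Return the first Link: trailer marked as the patch submission."""
    for url, note in LINK_RE.findall(commit_message):
        if url.startswith(PATCH_DOMAIN) or note == "patch":
            return url
    # Only unmarked links (discussions, bug reports) or none at all.
    return None
```

With such markers, a commit carrying several Link: trailers stays unambiguous for automation, while every trailer remains a clickable URL for humans.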
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-17 14:39 ` Konstantin Ryabitsev
@ 2024-06-17 16:04 ` Paul E. McKenney
2024-06-17 16:06 ` Konstantin Ryabitsev
2024-06-18 12:05 ` Michael Ellerman
1 sibling, 1 reply; 107+ messages in thread
From: Paul E. McKenney @ 2024-06-17 16:04 UTC (permalink / raw)
To: Konstantin Ryabitsev
Cc: Linus Torvalds, Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Jan Kara, Thorsten Leemhuis, ksummit

On Mon, Jun 17, 2024 at 10:39:14AM -0400, Konstantin Ryabitsev wrote:

[ . . .]

> I understand the reasons why everyone hates having the "Message-ID:" trailer,
> and this is fine. Can I counter-propose that we have a unique URL for links
> specifically going to patch submissions from which the commits were made? I've
> already been recommending using the "msgid.link" domain, but I'll go a bit
> further and put forward the recommendation that:
>
> - commits MAY have Link: trailers indicating the exact origin of the patch. To
> distinguish these links from other Link: trailers that may lead to relevant
> online discussions, they should either use the "patch.msgid.link" domain, or
> indicate the nature of the link using the hash-notation. Examples:
>
> - Link: https://patch.msgid.link/message@id-here
> - Link: https://lore.kernel.org/message@id-here # patch

So for your message that I am replying to, this would be like these?

- Link: https://patch.msgid.link/message@20240617-arboreal-industrious-hedgehog-5b84ae@meerkat
- Link: https://lore.kernel.org/message@20240617-arboreal-industrious-hedgehog-5b84ae@meerkat # patch

Or am I confusing keywords with variables in your canonical URLs? If
so, could you please post an example?

Thanx, Paul
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-17 16:04 ` Paul E. McKenney
@ 2024-06-17 16:06 ` Konstantin Ryabitsev
2024-06-17 16:14 ` Paolo Bonzini
0 siblings, 1 reply; 107+ messages in thread
From: Konstantin Ryabitsev @ 2024-06-17 16:06 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Linus Torvalds, Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Jan Kara, Thorsten Leemhuis, ksummit

On Mon, Jun 17, 2024 at 09:04:15AM GMT, Paul E. McKenney wrote:
> So for your message that I am replying to, this would be like these?
>
> - Link: https://patch.msgid.link/message@20240617-arboreal-industrious-hedgehog-5b84ae@meerkat
> - Link: https://lore.kernel.org/message@20240617-arboreal-industrious-hedgehog-5b84ae@meerkat # patch

The "message@id-here" means the entire message-id:

- Link: https://patch.msgid.link/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat
- Link: https://lore.kernel.org/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat # patch

-K
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-17 16:06 ` Konstantin Ryabitsev
@ 2024-06-17 16:14 ` Paolo Bonzini
2024-06-17 16:18 ` Konstantin Ryabitsev
0 siblings, 1 reply; 107+ messages in thread
From: Paolo Bonzini @ 2024-06-17 16:14 UTC (permalink / raw)
To: Konstantin Ryabitsev, Paul E. McKenney
Cc: Linus Torvalds, Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Takashi Iwai, Jan Kara, Thorsten Leemhuis, ksummit

On 6/17/24 18:06, Konstantin Ryabitsev wrote:
> On Mon, Jun 17, 2024 at 09:04:15AM GMT, Paul E. McKenney wrote:
>> So for your message that I am replying to, this would be like these?
>>
>> - Link: https://patch.msgid.link/message@20240617-arboreal-industrious-hedgehog-5b84ae@meerkat
>> - Link: https://lore.kernel.org/message@20240617-arboreal-industrious-hedgehog-5b84ae@meerkat # patch
>
> The "message@id-here" means the entire message-id:
>
> - Link: https://patch.msgid.link/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat
> - Link: https://lore.kernel.org/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat # patch

Two questions:

1) just one is needed, right? (should go without saying, but still)

2) Is the "/r/MESSAGE-ID" format
(https://lore.kernel.org/r/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat)
not valid or deprecated?

And of course, to Linus's point from yesterday, this would only apply
to patches that _did_ come from a mailing list that is indexed by lore.

Paolo
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-17 16:14 ` Paolo Bonzini
@ 2024-06-17 16:18 ` Konstantin Ryabitsev
2024-06-17 17:11 ` Geert Uytterhoeven
0 siblings, 1 reply; 107+ messages in thread
From: Konstantin Ryabitsev @ 2024-06-17 16:18 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Paul E. McKenney, Linus Torvalds, Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Takashi Iwai, Jan Kara, Thorsten Leemhuis, ksummit
On Mon, Jun 17, 2024 at 06:14:48PM GMT, Paolo Bonzini wrote:
> > - Link: https://patch.msgid.link/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat
> > - Link: https://lore.kernel.org/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat # patch
>
> Two questions:
>
> 1) just one is needed, right? (should go without saying, but still)
Yes, either-or. I just need to know which link takes me to the original patch.
> 2) Is the "/r/MESSAGE-ID" format (https://lore.kernel.org/r/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat)
> not valid or deprecated?
It's valid, but /r/ has been unnecessary for ages.
> And of course, to Linus's point from yesterday, this would only apply to
> patches that _did_ come from a mailing list that is indexed by lore.
Of course.
-K
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-17 16:18 ` Konstantin Ryabitsev
@ 2024-06-17 17:11 ` Geert Uytterhoeven
0 siblings, 0 replies; 107+ messages in thread
From: Geert Uytterhoeven @ 2024-06-17 17:11 UTC (permalink / raw)
To: Konstantin Ryabitsev
Cc: Paolo Bonzini, Paul E. McKenney, Linus Torvalds, Steven Rostedt, Michael Ellerman, Michael S. Tsirkin, Takashi Iwai, Jan Kara, Thorsten Leemhuis, ksummit
Hi Konstantin,
On Mon, Jun 17, 2024 at 6:57 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
> On Mon, Jun 17, 2024 at 06:14:48PM GMT, Paolo Bonzini wrote:
> > > - Link: https://patch.msgid.link/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat
> > > - Link: https://lore.kernel.org/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat # patch
> >
> > Two questions:
> >
> > 1) just one is needed, right? (should go without saying, but still)
>
> Yes, either-or. I just need to know which link takes me to the original patch.
>
> > 2) Is the "/r/MESSAGE-ID" format (https://lore.kernel.org/r/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat)
> > not valid or deprecated?
>
> It's valid, but /r/ has been unnecessary for ages.
Care to update
https://docs.kernel.org/maintainer/configure-git.html?highlight=lore.kernel.org/r/?
Thanks!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-17 14:39 ` Konstantin Ryabitsev
2024-06-17 16:04 ` Paul E. McKenney
@ 2024-06-18 12:05 ` Michael Ellerman
1 sibling, 0 replies; 107+ messages in thread
From: Michael Ellerman @ 2024-06-18 12:05 UTC (permalink / raw)
To: Konstantin Ryabitsev, Linus Torvalds
Cc: Steven Rostedt, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Jan Kara, Thorsten Leemhuis, ksummit
Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:
> ...
>
> I understand the reasons why everyone hates having the "Message-ID:" trailer,
> and this is fine. Can I counter-propose that we have a unique URL for links
> specifically going to patch submissions from which the commits were made? I've
> been already recommending using the "msgid.link" domain, but I'll go a bit
> further and put forward the recommendation that:
>
> - commits MAY have Link: trailers indicating the exact origin of the patch. To
> distinguish these links from other Link: trailers that may lead to relevant
> online discussions, they should either use the "patch.msgid.link" domain, or
> indicate the nature of the link using the hash-notation. Examples:
>
> - Link: https://patch.msgid.link/message@id-here
This is the better option. The fact that it's the patch link is right
there at the start of the line "patch.msgid.link", and will always be in
the same place visually, which helps human readers trying to recognise
it amongst other links.
> - Link: https://lore.kernel.org/message@id-here # patch
Here you have to read all the way to the end of the line to see that
it's the patch. And it is worse with longer message ids, eg:
- Link: https://lore.kernel.org/message@20240617-arboreal-industrious-hedgehog-5b84ae@meerkat # patch
The "# patch" is almost off the edge of the screen.
It's also a bit easier to grep for.
> This would satisfy both the need for automation to have a reliable way to find
> the origin of the commit, and clearly indicate the nature of the link for
> humans doing commit spelunking.
+100
cheers
^ permalink raw reply [flat|nested] 107+ messages in thread
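As an illustration of the two trailer forms discussed above, here is a minimal shell sketch of how a script might pick the patch-origin link out of a commit message. The function name `patch_origin_links` is hypothetical, and the sketch assumes the trailers are written exactly as in the examples (one `Link:` per line, a single space before `# patch`); it is not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: read a commit message on stdin and print the URLs of
# Link: trailers that identify the original patch submission, i.e. either
#   Link: https://patch.msgid.link/<message-id>
#   Link: https://lore.kernel.org/<message-id> # patch
patch_origin_links() {
    # Keep only the two proposed trailer forms, then strip the "Link: "
    # prefix and the trailing "# patch" marker so only the URL remains.
    grep -E \
        -e '^Link: https://patch\.msgid\.link/[^ ]+$' \
        -e '^Link: https://lore\.kernel\.org/[^ ]+ # patch$' |
        sed -e 's/^Link: //' -e 's/ # patch$//'
}

# Possible usage inside a git checkout (not run here):
#   git log -1 --format=%B | patch_origin_links
```

A plain `Link: https://lore.kernel.org/...` trailer without the `# patch` marker is deliberately skipped, since under the proposal it may point at a related discussion rather than at the patch itself.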
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 1:13 ` Linus Torvalds
2024-06-16 3:28 ` Steven Rostedt
@ 2024-06-16 7:26 ` Takashi Iwai
2024-06-16 8:10 ` Paolo Bonzini
2024-06-16 8:31 ` Jiri Kosina
3 siblings, 0 replies; 107+ messages in thread
From: Takashi Iwai @ 2024-06-16 7:26 UTC (permalink / raw)
To: Linus Torvalds
Cc: Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit
On Sun, 16 Jun 2024 03:13:57 +0200, Linus Torvalds wrote:
>
> On Fri, 14 Jun 2024 at 07:29, Michael Ellerman <mpe@ellerman.id.au> wrote:
> >
> > Message-Id: sucks, I want a link I can open with a single click.
>
> !00% agreed.
>
> There is no way in hell I will endorse adding more of those completely
> *idiotic* "Message-ID" things.
>
> Yes, people use them. It's a damn shame.
>
> There is no excuse for that completely broken model. It's objectively
> and unquestionably worse than having a "link".
>
> Here's the thing: if that message-ID isn't public, then that line
> SHOULD NOT EXIST and is an actual real problem. I personally look at
> those, and go "is that actually available on lore?"
>
> And if that message-id _is_ public, then it has a link, and it's much
> easier for people to check.
>
> Ergo: there is absolutely zero reason to ever use Message-ID.
>
> People need to stop advocating that sh*t.
>
> And no, I'm not at all happy with the fact that apparently vhost and
> kvm has made it their thing.
>
> Paolo, Michael, Takashi, please put useful links, not those braindead
> message id's in your commit messages.
Sorry for that. I used to convert Message-Id to Link in the past in a
git hook, but this got broken after migrating my workstation to a new
machine, so you saw it in my previous few PRs as a side effect.
I believe it's already fixed in the last PR.
Takashi
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 1:13 ` Linus Torvalds
2024-06-16 3:28 ` Steven Rostedt
2024-06-16 7:26 ` Takashi Iwai
@ 2024-06-16 8:10 ` Paolo Bonzini
2024-06-16 11:31 ` Laurent Pinchart
` (2 more replies)
2024-06-16 8:31 ` Jiri Kosina
3 siblings, 3 replies; 107+ messages in thread
From: Paolo Bonzini @ 2024-06-16 8:10 UTC (permalink / raw)
To: Linus Torvalds, Michael Ellerman, Michael S. Tsirkin, Takashi Iwai
Cc: Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit
On 6/16/24 03:13, Linus Torvalds wrote:
> And no, I'm not at all happy with the fact that apparently vhost and
> kvm has made it their thing.
>
> Paolo, Michael, Takashi, please put useful links, not those braindead
> message id's in your commit messages.
Ok, ok. Before lore existed, there was no service that I can remember
that archived messages with a message-id in the URL. So, for example
Gmane links would be useless now, and patchwork links are not really
something I'd trust for long-term archival either.
These days, it's mostly just that I have set am.message-id to true years
ago; but since lore is managed by kernel.org, we can expect the URLs to
be stable and the original reason to use Message-ID is obsolete. Having
learnt right now about the applypatch-msg git hook, I've stuck a
sed -i -e 's,^Message-ID: <\(.*\)>$,Link: https://lore.kernel.org/r/\1,' "$1"
in there which should do the trick. I guess Michael and Takashi can do
the same. :)
By the way, if you use Firefox, you can do the following two steps to
install a search plugin that searches lore by Message-ID:
- first go to
https://mycroftproject.com/install.html?id=121759&basename=lore_kernel_org&icontype=png&name=lore.kernel.org
to install the search engine (an XML file, you can see it at
https://mycroftproject.com/installos.php/121759/lore_kernel_org.xml).
- then go to about:preferences#search and add a search shortcut
On Chrome instead you can add https://lore.kernel.org/r/%s at
chrome://settings/searchEngines.
(Apart from git commit messages, I use it also with the
https://addons.thunderbird.net/en-us/thunderbird/addon/copy-message-id/
extension for Thunderbird).
Paolo
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 8:10 ` Paolo Bonzini
@ 2024-06-16 11:31 ` Laurent Pinchart
2024-06-16 11:39 ` Takashi Iwai
2024-06-16 16:40 ` Linus Torvalds
2 siblings, 0 replies; 107+ messages in thread
From: Laurent Pinchart @ 2024-06-16 11:31 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Linus Torvalds, Michael Ellerman, Michael S. Tsirkin, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit
On Sun, Jun 16, 2024 at 10:10:25AM +0200, Paolo Bonzini wrote:
> On 6/16/24 03:13, Linus Torvalds wrote:
> > And no, I'm not at all happy with the fact that apparently vhost and
> > kvm has made it their thing.
> >
> > Paolo, Michael, Takashi, please put useful links, not those braindead
> > message id's in your commit messages.
>
> Ok, ok. Before lore existed, there was no service that I can remember
> that archived messages with a message-id in the URL. So, for example
> Gmane links would be useless now, and patchwork links are not really
> something I'd trust for long-term archival either.
>
> These days, it's mostly just that I have set am.message-id to true years
> ago; but since lore is managed by kernel.org, we can expect the URLs to
> be stable and the original reason to use Message-ID is obsolete. Having
> learnt right now about the applypatch-msg git hook, I've stuck a
>
> sed -i -e 's,^Message-ID: <\(.*\)>$,Link: https://lore.kernel.org/r/\1,'
> "$1"
>
> in there which should do the trick. I guess Michael and Takashi can do
> the same. :)
>
>
> By the way, if you use Firefox, you can do the following two steps to
> install a search plugin that searches lore by Message-ID:
>
> - first go to
> https://mycroftproject.com/install.html?id=121759&basename=lore_kernel_org&icontype=png&name=lore.kernel.org
> to install the search engine (an XML file, you can see it at
> https://mycroftproject.com/installos.php/121759/lore_kernel_org.xml).
>
> - then go to about:preferences#search and add a search shortcut
You can also add a smart bookmark:
- Name: 'Lore' (or whatever you want as a name)
- URL: https://lore.kernel.org/r/%s
- Keyword: 'lore' (or whatever you want as a keyword)
Entering 'lore <msg-id>' in the URL bar will then do the right thing.
> On Chrome instead you can add https://lore.kernel.org/r/%s at
> chrome://settings/searchEngines.
>
> (Apart from git commit messages, I use it also with the
> https://addons.thunderbird.net/en-us/thunderbird/addon/copy-message-id/
> extension for Thunderbird).
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 8:10 ` Paolo Bonzini
2024-06-16 11:31 ` Laurent Pinchart
@ 2024-06-16 11:39 ` Takashi Iwai
2024-06-16 16:40 ` Linus Torvalds
2 siblings, 0 replies; 107+ messages in thread
From: Takashi Iwai @ 2024-06-16 11:39 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Linus Torvalds, Michael Ellerman, Michael S. Tsirkin, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit
On Sun, 16 Jun 2024 10:10:25 +0200, Paolo Bonzini wrote:
>
> On 6/16/24 03:13, Linus Torvalds wrote:
> > And no, I'm not at all happy with the fact that apparently vhost and
> > kvm has made it their thing.
> >
> > Paolo, Michael, Takashi, please put useful links, not those braindead
> > message id's in your commit messages.
>
> Ok, ok. Before lore existed, there was no service that I can remember
> that archived messages with a message-id in the URL. So, for example
> Gmane links would be useless now, and patchwork links are not really
> something I'd trust for long-term archival either.
>
> These days, it's mostly just that I have set am.message-id to true
> years ago; but since lore is managed by kernel.org, we can expect the
> URLs to be stable and the original reason to use Message-ID is
> obsolete. Having learnt right now about the applypatch-msg git hook,
> I've stuck a
>
> sed -i -e 's,^Message-ID: <\(.*\)>$,Link:
> https://lore.kernel.org/r/\1,' "$1"
>
> in there which should do the trick. I guess Michael and Takashi can
> do the same. :)
Heh, that's actually what I've done already, but my script had
"Message-Id" or such to match and it didn't hit any longer after some
mail change; as a result, Message-id tag remained.
You'd better to make it case-insensitive match.
HTH,
Takashi
^ permalink raw reply [flat|nested] 107+ messages in thread
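The case-insensitivity point above can be sketched as follows. This is only an illustration under stated assumptions: the function name `msgid_to_link` is hypothetical, it works as a stdin-to-stdout filter rather than editing "$1" in place as the quoted hook command does, and it uses POSIX bracket expressions instead of a sed extension so that "Message-ID", "Message-Id" and "Message-id" are all matched. As noted elsewhere in the thread, the resulting link is only meaningful for patches that actually arrived through a list indexed by lore.

```shell
#!/bin/sh
# Hypothetical filter: rewrite a Message-ID trailer (any common
# capitalisation) into a Link: trailer, leaving all other lines untouched.
msgid_to_link() {
    # [Mm]essage-[Ii][Dd] covers Message-ID, Message-Id and Message-id
    # without relying on a non-POSIX case-insensitivity flag.
    sed -e 's,^[Mm]essage-[Ii][Dd]: <\(.*\)>$,Link: https://lore.kernel.org/r/\1,'
}

# Possible usage from an applypatch-msg hook, where "$1" is the file
# holding the proposed commit message (not run here):
#   msgid_to_link < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
```

Writing to a temporary file and renaming avoids depending on `sed -i`, whose behaviour differs between implementations.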
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 8:10 ` Paolo Bonzini
2024-06-16 11:31 ` Laurent Pinchart
2024-06-16 11:39 ` Takashi Iwai
@ 2024-06-16 16:40 ` Linus Torvalds
2 siblings, 0 replies; 107+ messages in thread
From: Linus Torvalds @ 2024-06-16 16:40 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Michael Ellerman, Michael S. Tsirkin, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit
On Sun, 16 Jun 2024 at 01:10, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> These days, it's mostly just that I have set am.message-id to true years
> ago; but since lore is managed by kernel.org, we can expect the URLs to
> be stable and the original reason to use Message-ID is obsolete. Having
> learnt right now about the applypatch-msg git hook, I've stuck a
>
> sed -i -e 's,^Message-ID: <\(.*\)>$,Link: https://lore.kernel.org/r/\1,'
> "$1"
>
> in there which should do the trick. I guess Michael and Takashi can do
> the same. :)
Please don't.
At least not unless you are *SURE* that you only pick up those
messages from lore.
IOW, where did that original brain-dead Message-ID tag originate from?
Because if you use "git am", then using "-l" already does the right
thing.
So the actual fix is to *NOT* use "git am -i", but "git am -l".
Do not fix up the mess after the fact, particularly with a script that
can actually corrupt things and make up links that don't exist.
Just fix the mess at the source.
Linus
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 1:13 ` Linus Torvalds
` (2 preceding siblings ...)
2024-06-16 8:10 ` Paolo Bonzini
@ 2024-06-16 8:31 ` Jiri Kosina
2024-06-16 8:54 ` Geert Uytterhoeven
3 siblings, 1 reply; 107+ messages in thread
From: Jiri Kosina @ 2024-06-16 8:31 UTC (permalink / raw)
To: Linus Torvalds
Cc: Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit
On Sat, 15 Jun 2024, Linus Torvalds wrote:
> There is no excuse for that completely broken model. It's objectively
> and unquestionably worse than having a "link".
I think the 'philosophy' behind favoring Message-id over Link is that
Message-id is set in stone forever, while Link is not.
Should lore go away at some point in the future, something else will
probably take over, and you'll be able to search for Message-id there, but
the Link will not be functional any more (sure, you can extract the
Message-id: from it manually).
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 107+ messages in thread
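The manual extraction mentioned above is mechanical for the link forms that appear in this thread. A minimal sketch, assuming the URL is exactly `https://<host>/<message-id>` or `https://<host>/r/<message-id>` with no trailing path component and no URL-escaped characters; the function name `link_to_msgid` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: recover the Message-ID from a lore-style link so it
# can be searched for in whichever archive is available.
link_to_msgid() {
    # First drop the scheme and host, then drop the optional legacy "r/"
    # prefix that lore.kernel.org links may carry.
    printf '%s\n' "$1" | sed -e 's,^https://[^/]*/,,' -e 's,^r/,,'
}

# Example call, using a link quoted in this thread:
#   link_to_msgid https://lore.kernel.org/r/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat
```

Because the Message-ID is embedded verbatim in the URL, the link loses no information relative to a bare Message-ID trailer, which is the trade-off being debated here.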
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-16 8:31 ` Jiri Kosina
@ 2024-06-16 8:54 ` Geert Uytterhoeven
0 siblings, 0 replies; 107+ messages in thread
From: Geert Uytterhoeven @ 2024-06-16 8:54 UTC (permalink / raw)
To: Jiri Kosina
Cc: Linus Torvalds, Michael Ellerman, Michael S. Tsirkin, Paolo Bonzini, Takashi Iwai, Konstantin Ryabitsev, Jan Kara, Thorsten Leemhuis, ksummit
Hi Jiri,
On Sun, Jun 16, 2024 at 10:32 AM Jiri Kosina <jikos@kernel.org> wrote:
> On Sat, 15 Jun 2024, Linus Torvalds wrote:
> > There is no excuse for that completely broken model. It's objectively
> > and unquestionably worse than having a "link".
>
> I think the 'philosophy' behind favoring Message-id over Link is that
> Message-id is set in stone forever, while Link is not.
>
> Should lore go away at some point in the future, something else will
> probably take over, and you'll be able to search for Message-id there, but
> the Link will not be functional any more (sure, you can extract the
> Message-id: from it manually).
The premise is that kernel.org won't go away.
And if it ever goes away, we map it to its replacement in our local
/etc/dnsmasq.conf ;-)
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-13 9:59 ` Jan Kara
2024-06-13 10:18 ` Thorsten Leemhuis
2024-06-13 14:08 ` Konstantin Ryabitsev
@ 2024-06-13 19:39 ` Dan Carpenter
2024-06-14 1:00 ` Steven Rostedt
2 siblings, 1 reply; 107+ messages in thread
From: Dan Carpenter @ 2024-06-13 19:39 UTC (permalink / raw)
To: Jan Kara; +Cc: Thorsten Leemhuis, ksummit
On Thu, Jun 13, 2024 at 11:59:17AM +0200, Jan Kara wrote:
> FWIW I (and a few other maintainers) use 'Message-Id' tag to link to
> submission.
These are great. What I wish is that someone added that to Patchwork.
KTODO: Add Message-Id tag support to patchwork
(KTODO is like a when you say a wish and throw a coin into a fountain
except it doesn't cost you a quarter).
regards,
dan carpenter
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-13 19:39 ` Dan Carpenter
@ 2024-06-14 1:00 ` Steven Rostedt
0 siblings, 0 replies; 107+ messages in thread
From: Steven Rostedt @ 2024-06-14 1:00 UTC (permalink / raw)
To: Dan Carpenter; +Cc: Jan Kara, Thorsten Leemhuis, ksummit
On Thu, 13 Jun 2024 22:39:48 +0300
Dan Carpenter <dan.carpenter@linaro.org> wrote:
> On Thu, Jun 13, 2024 at 11:59:17AM +0200, Jan Kara wrote:
> > FWIW I (and a few other maintainers) use 'Message-Id' tag to link to
> > submission.
>
> These are great. What I wish is that someone added that to Patchwork.
>
> KTODO: Add Message-Id tag support to patchwork
>
> (KTODO is like a when you say a wish and throw a coin into a fountain
> except it doesn't cost you a quarter).
>
That would be great. I need to update my scripts. I pull from patchwork
and then run a script that adds the message-id, but makes it a "Link"
tag. Which reading this thread, I realize is wrong :-p
I'll go update it now! (and repull some of my changes I'm getting for
linux-next).
-- Steve
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-13 8:42 ` [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions Thorsten Leemhuis
2024-06-13 9:59 ` Jan Kara
@ 2024-06-13 11:58 ` James Bottomley
2024-06-13 13:06 ` Sasha Levin
2024-06-13 13:45 ` Greg KH
2024-06-13 13:40 ` Sasha Levin
2024-06-13 14:28 ` Andrew Lunn
3 siblings, 2 replies; 107+ messages in thread
From: James Bottomley @ 2024-06-13 11:58 UTC (permalink / raw)
To: Thorsten Leemhuis, ksummit
On Thu, 2024-06-13 at 10:42 +0200, Thorsten Leemhuis wrote:
> The scenario shown at the start of the thread illustrates a problem I
> see frequently: commits with a Fixes: tag end up in new to stable
> series releases just days after being mainlined and cause regressions
> -- just like they do in mainline, which just was not known yet at the
> time of backporting. This happens extremely often right after merge
> windows when huge piles of changes are backported to the stable trees
> each cycle shortly after -rc1 is out (which even some kernel
> developers apparently are somewhat afraid to test from what I've
> seen).
I haven't really observed this for curated fixes. For most subsystems,
patches with Fixes tags that are cc'd to stable tend to go steadily
outside the merge window. Obviously a few arrive within it, but
usually at roughly the rate they arrive outside it.
What I observe in the merge window is huge piles of patches go into
stable *without* a cc:stable tag from the autosel machinery (and quite
a few even without fixes: tags).
So this does beg a couple of questions:
Since you have the figures, what's the proportion of regressions caused
by backports to stable without cc:stable tags?
Could we fix a lot of this by delaying autosel? It tends to ramp up in
the merge window when everyone is concentrating on other things, so any
regressions it causes naturally get ignored for a couple of weeks.
James
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-13 11:58 ` James Bottomley
@ 2024-06-13 13:06 ` Sasha Levin
2024-06-13 13:56 ` James Bottomley
2024-06-13 13:45 ` Greg KH
1 sibling, 1 reply; 107+ messages in thread
From: Sasha Levin @ 2024-06-13 13:06 UTC (permalink / raw)
To: James Bottomley; +Cc: Thorsten Leemhuis, ksummit
On Thu, Jun 13, 2024 at 07:58:58AM -0400, James Bottomley wrote:
>On Thu, 2024-06-13 at 10:42 +0200, Thorsten Leemhuis wrote:
>> The scenario shown at the start of the thread illustrates a problem I
>> see frequently: commits with a Fixes: tag end up in new to stable
>> series releases just days after being mainlined and cause regressions
>> -- just like they do in mainline, which just was not known yet at the
>> time of backporting. This happens extremely often right after merge
>> windows when huge piles of changes are backported to the stable trees
>> each cycle shortly after -rc1 is out (which even some kernel
>> developers apparently are somewhat afraid to test from what I've
>> seen).
>
>I haven't really observed this for curated fixes. For most subsystems,
>patches with Fixes tags that are cc'd to stable tend to go steadily
>outside the merge window. Obviously a few arrive within it, but
>usually at roughly the rate they arrive outside it.
>
>What I observe in the merge window is huge piles of patches go into
>stable *without* a cc:stable tag from the autosel machinery (and quite
>a few even without fixes: tags).
Could you provide a concrete example? This shouldn't happen.
>So this does beg a couple of questions:
>
>Since you have the figures, what's the proportion of regressions caused
>by backports to stable without cc:stable tags?
This question came up two years ago and we had statistics around this.
Autosel patches didn't cause more (actually, it was *less*) regressions
than stable tagged ones.
>Could we fix a lot of this by delaying autosel? It tends to ramp up in
>the merge window when everyone is concentrating on other things, so any
>regressions it causes naturally get ignored for a couple of weeks.
autosel is currently delayed about 3-4 weeks, sometimes more.
--
Thanks,
Sasha
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-13 13:06 ` Sasha Levin
@ 2024-06-13 13:56 ` James Bottomley
2024-06-13 14:02 ` Greg KH
2024-06-13 18:08 ` Sasha Levin
0 siblings, 2 replies; 107+ messages in thread
From: James Bottomley @ 2024-06-13 13:56 UTC (permalink / raw)
To: Sasha Levin; +Cc: Thorsten Leemhuis, ksummit
On Thu, 2024-06-13 at 09:06 -0400, Sasha Levin wrote:
> On Thu, Jun 13, 2024 at 07:58:58AM -0400, James Bottomley wrote:
> > On Thu, 2024-06-13 at 10:42 +0200, Thorsten Leemhuis wrote:
> > > The scenario shown at the start of the thread illustrates a
> > > problem I see frequently: commits with a Fixes: tag end up in new
> > > to stable series releases just days after being mainlined and
> > > cause regressions -- just like they do in mainline, which just
> > > was not known yet at the time of backporting. This happens
> > > extremely often right after merge windows when huge piles of
> > > changes are backported to the stable trees each cycle shortly
> > > after -rc1 is out (which even some kernel developers apparently
> > > are somewhat afraid to test from what I've
> > > seen).
> >
> > I haven't really observed this for curated fixes. For most
> > subsystems, patches with Fixes tags that are cc'd to stable tend to
> > go steadily outside the merge window. Obviously a few arrive
> > within it, but usually at roughly the rate they arrive outside it.
> >
> > What I observe in the merge window is huge piles of patches go into
> > stable *without* a cc:stable tag from the autosel machinery (and
> > quite a few even without fixes: tags).
>
> Could you provide a concrete example? This shouldn't happen.
This one has no fixes or cc stable:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37f1663c91934f664fb850306708094a324c227c
Yet here it is backported:
Message-id: 20240603121056.1837607-1-sashal@kernel.org
(I can't give a lore reference because conveniently it was a personal
cc with no tracked mailing list).
I picked that one because we discovered a bug with the strlcpy to
strscpy conversions in SCSI which it looks like this backport has.
> > So this does beg a couple of questions:
> >
> > Since you have the figures, what's the proportion of regressions
> > caused by backports to stable without cc:stable tags?
>
> This question came up two years ago and we had statistics around
> this. Autosel patches didn't cause more (actually, it was *less*)
> regressions than stable tagged ones.
OK, so Thorsten's stats should bear this out then ...
> > Could we fix a lot of this by delaying autosel? It tends to ramp
> > up in the merge window when everyone is concentrating on other
> > things, so any regressions it causes naturally get ignored for a
> > couple of weeks.
>
> autosel is currently delayed about 3-4 weeks, sometimes more.
That's fairly recent then. When I look at 6.8 autosel began its flood
pretty much the moment the first SCSI pull request went in to the merge
window. Checking 6.9 it seems to be about 10 days after ... has that
made a difference, or is it too early to tell?
James
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-13 13:56 ` James Bottomley
@ 2024-06-13 14:02 ` Greg KH
2024-06-13 15:11 ` James Bottomley
2024-06-13 18:08 ` Sasha Levin
1 sibling, 1 reply; 107+ messages in thread
From: Greg KH @ 2024-06-13 14:02 UTC (permalink / raw)
To: James Bottomley; +Cc: Sasha Levin, Thorsten Leemhuis, ksummit
On Thu, Jun 13, 2024 at 09:56:56AM -0400, James Bottomley wrote:
> On Thu, 2024-06-13 at 09:06 -0400, Sasha Levin wrote:
> > On Thu, Jun 13, 2024 at 07:58:58AM -0400, James Bottomley wrote:
> > > On Thu, 2024-06-13 at 10:42 +0200, Thorsten Leemhuis wrote:
> > > > The scenario shown at the start of the thread illustrates a
> > > > problem I see frequently: commits with a Fixes: tag end up in new
> > > > to stable series releases just days after being mainlined and
> > > > cause regressions -- just like they do in mainline, which just
> > > > was not known yet at the time of backporting. This happens
> > > > extremely often right after merge windows when huge piles of
> > > > changes are backported to the stable trees each cycle shortly
> > > > after -rc1 is out (which even some kernel developers apparently
> > > > are somewhat afraid to test from what I've
> > > > seen).
> > >
> > > I haven't really observed this for curated fixes. For most
> > > subsystems, patches with Fixes tags that are cc'd to stable tend to
> > > go steadily outside the merge window. Obviously a few arrive
> > > within it, but usually at roughly the rate they arrive outside it.
> > >
> > > What I observe in the merge window is huge piles of patches go into
> > > stable *without* a cc:stable tag from the autosel machinery (and
> > > quite a few even without fixes: tags).
> >
> > Could you provide a concrete example? This shouldn't happen.
>
> This one has no fixes or cc stable:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37f1663c91934f664fb850306708094a324c227c
>
> Yet here it is backported:
>
> Message-id: 20240603121056.1837607-1-sashal@kernel.org
>
> (I can't give a lore reference because conveniently it was a personal
> cc with no tracked mailing list).
>
> I picked that one because we discovered a bug with the strlcpy to
> strscpy conversions in SCSI which it looks like this backport has.
It says, in the commit message:
Stable-dep-of: c3408c4ae041 ("scsi: qla2xxx: Avoid possible run-time warning with long model_num")
That is why it was backported.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions
2024-06-13 14:02 ` Greg KH
@ 2024-06-13 15:11 ` James Bottomley
2024-06-13 16:27 ` Greg KH
2024-06-14 18:47 ` Sasha Levin
0 siblings, 2 replies; 107+ messages in thread
From: James Bottomley @ 2024-06-13 15:11 UTC (permalink / raw)
To: Greg KH; +Cc: Sasha Levin, Thorsten Leemhuis, ksummit
On Thu, 2024-06-13 at 16:02 +0200, Greg KH wrote:
> On Thu, Jun 13, 2024 at 09:56:56AM -0400, James Bottomley wrote:
> > On Thu, 2024-06-13 at 09:06 -0400, Sasha Levin wrote:
[...]
> > > Could you provide a concrete example? This shouldn't happen.
> >
> > This one has no fixes or cc stable:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37f1663c91934f664fb850306708094a324c227c
> >
> > Yet here it is backported:
> >
> > Message-id: 20240603121056.1837607-1-sashal@kernel.org
> >
> > (I can't give a lore reference because conveniently it was a
> > personal cc with no tracked mailing list).
> >
> > I picked that one because we discovered a bug with the strlcpy to
> > strscpy conversions in SCSI which it looks like this backport has.
>
> It says, in the commit message:
> Stable-dep-of: c3408c4ae041 ("scsi: qla2xxx: Avoid possible
> run-time warning with long model_num")
>
> That is why it was backported.
Well, that still tracks back to a patch which wasn't tagged:
c3408c4ae041 is actually fixing a bug in 527e9b704c3d which is another
of the strlcpy to strscpy patches which also has no cc:stable or fixes
tag:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=527e9b704c3d
James
^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 15:11 ` James Bottomley @ 2024-06-13 16:27 ` Greg KH 2024-06-14 18:47 ` Sasha Levin 1 sibling, 0 replies; 107+ messages in thread From: Greg KH @ 2024-06-13 16:27 UTC (permalink / raw) To: James Bottomley; +Cc: Sasha Levin, Thorsten Leemhuis, ksummit On Thu, Jun 13, 2024 at 11:11:54AM -0400, James Bottomley wrote: > On Thu, 2024-06-13 at 16:02 +0200, Greg KH wrote: > > On Thu, Jun 13, 2024 at 09:56:56AM -0400, James Bottomley wrote: > > > On Thu, 2024-06-13 at 09:06 -0400, Sasha Levin wrote: > [...] > > > > Could you provide a concrete example? This shouldn't happen. > > > > > > This one has no fixes or cc stable: > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37f1663c91934f664fb850306708094a324c227c > > > > > > Yet here it is backported: > > > > > > Message-id: 20240603121056.1837607-1-sashal@kernel.org > > > > > > (I can't give a lore reference because conveniently it was a > > > personal cc with no tracked mailing list). > > > > > > I picked that one because we discovered a bug with the strlcpy to > > > strscpy conversions in SCSI which it looks like this backport has. > > > > It says, in the commit message: > > Stable-dep-of: c3408c4ae041 ("scsi: qla2xxx: Avoid possible > > run-time warning with long model_num") > > > > That is why it was backported. > > Well, that still tracks back to a patch which wasn't tagged: > c3408c4ae041 is actually fixing a bug in 527e9b704c3d which is another > of the strlcpy to strscpy patches which also has no cc:stable or fixes > tag: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=527e9b704c3d True, kind of impossible for us to ever figure that one out :( Care to send that to us on stable@vger.kernel.org so that we know to queue it up? thanks, greg k-h ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 15:11 ` James Bottomley 2024-06-13 16:27 ` Greg KH @ 2024-06-14 18:47 ` Sasha Levin 2024-06-17 10:59 ` Vlastimil Babka 1 sibling, 1 reply; 107+ messages in thread From: Sasha Levin @ 2024-06-14 18:47 UTC (permalink / raw) To: James Bottomley; +Cc: Greg KH, Thorsten Leemhuis, ksummit On Thu, Jun 13, 2024 at 11:11:54AM -0400, James Bottomley wrote: >On Thu, 2024-06-13 at 16:02 +0200, Greg KH wrote: >> On Thu, Jun 13, 2024 at 09:56:56AM -0400, James Bottomley wrote: >> > On Thu, 2024-06-13 at 09:06 -0400, Sasha Levin wrote: >[...] >> > > Could you provide a concrete example? This shouldn't happen. >> > >> > This one has no fixes or cc stable: >> > >> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37f1663c91934f664fb850306708094a324c227c >> > >> > Yet here it is backported: >> > >> > Message-id: 20240603121056.1837607-1-sashal@kernel.org >> > >> > (I can't give a lore reference because conveniently it was a >> > personal cc with no tracked mailing list). >> > >> > I picked that one because we discovered a bug with the strlcpy to >> > strscpy conversions in SCSI which it looks like this backport has. >> >> It says, in the commit message: >> Stable-dep-of: c3408c4ae041 ("scsi: qla2xxx: Avoid possible >> run-time warning with long model_num") >> >> That is why it was backported. > >Well, that still tracks back to a patch which wasn't tagged: >c3408c4ae041 is actually fixing a bug in 527e9b704c3d which is another >of the strlcpy to strscpy patches which also has no cc:stable or fixes >tag: > >https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=527e9b704c3d Nor was it ever backported to any stable tree... What am I missing? -- Thanks, Sasha ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 18:47 ` Sasha Levin @ 2024-06-17 10:59 ` Vlastimil Babka 0 siblings, 0 replies; 107+ messages in thread From: Vlastimil Babka @ 2024-06-17 10:59 UTC (permalink / raw) To: Sasha Levin, James Bottomley; +Cc: Greg KH, Thorsten Leemhuis, ksummit On 6/14/24 8:47 PM, Sasha Levin wrote: > On Thu, Jun 13, 2024 at 11:11:54AM -0400, James Bottomley wrote: >>On Thu, 2024-06-13 at 16:02 +0200, Greg KH wrote: >>> On Thu, Jun 13, 2024 at 09:56:56AM -0400, James Bottomley wrote: >>> > On Thu, 2024-06-13 at 09:06 -0400, Sasha Levin wrote: >>[...] >>> > > Could you provide a concrete example? This shouldn't happen. >>> > >>> > This one has no fixes or cc stable: >>> > >>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37f1663c91934f664fb850306708094a324c227c >>> > >>> > Yet here it is backported: >>> > >>> > Message-id: 20240603121056.1837607-1-sashal@kernel.org >>> > >>> > (I can't give a lore reference because conveniently it was a >>> > personal cc with no tracked mailing list). >>> > >>> > I picked that one because we discovered a bug with the strlcpy to >>> > strscpy conversions in SCSI which it looks like this backport has. >>> >>> It says, in the commit message: >>> Stable-dep-of: c3408c4ae041 ("scsi: qla2xxx: Avoid possible >>> run-time warning with long model_num") >>> >>> That is why it was backported. >> >>Well, that still tracks back to a patch which wasn't tagged: >>c3408c4ae041 is actually fixing a bug in 527e9b704c3d which is another >>of the strlcpy to strscpy patches which also has no cc:stable or fixes >>tag: >> >>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=527e9b704c3d > > Nor was it ever backported to any stable tree... What am I missing? What I'm missing is why 37f1663c919 was backported with "Stable-dep-of: c3408c4ae041" while c3408c4ae041 was not backported? 
^ permalink raw reply [flat|nested] 107+ messages in thread
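The tag chain debated in the messages above (a backport carrying "Stable-dep-of:", which points at a fix that may or may not carry "Fixes:" or a stable Cc) can be read out of a commit message mechanically. What follows is only a minimal sketch, assuming the commit message is already available as a plain string; the function name `backport_hints` and the returned dictionary layout are hypothetical and are not part of any stable-team tooling.

```python
import re

# Trailers that reference another commit by (abbreviated) hash, e.g.
#   Stable-dep-of: c3408c4ae041 ("scsi: qla2xxx: ...")
#   Fixes: 527e9b704c3d ("...")
REF_TRAILER_RE = re.compile(
    r'^(Stable-dep-of|Fixes):\s*([0-9a-f]{7,40})\b',
    re.IGNORECASE | re.MULTILINE,
)

# A stable Cc line, with or without angle brackets or a trailing "# note".
STABLE_CC_RE = re.compile(
    r'^Cc:\s*<?stable@(?:vger\.)?kernel\.org>?',
    re.IGNORECASE | re.MULTILINE,
)


def backport_hints(message):
    """Collect the tags that explain why a commit might be backported.

    Hypothetical helper: returns the hashes named in Fixes: and
    Stable-dep-of: trailers and whether a stable Cc line is present.
    """
    hints = {"fixes": [], "stable-dep-of": [], "cc_stable": False}
    for tag, sha in REF_TRAILER_RE.findall(message):
        hints[tag.lower()].append(sha.lower())
    hints["cc_stable"] = bool(STABLE_CC_RE.search(message))
    return hints
```

Run on the backport discussed above, such a helper would report c3408c4ae041 under "stable-dep-of" and nothing under "fixes", which is the situation James and Greg are describing. It says nothing about whether the referenced commit itself ever reached a stable tree; answering that needs the tree history.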
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 13:56 ` James Bottomley 2024-06-13 14:02 ` Greg KH @ 2024-06-13 18:08 ` Sasha Levin 1 sibling, 0 replies; 107+ messages in thread From: Sasha Levin @ 2024-06-13 18:08 UTC (permalink / raw) To: James Bottomley; +Cc: Thorsten Leemhuis, ksummit On Thu, Jun 13, 2024 at 09:56:56AM -0400, James Bottomley wrote: >On Thu, 2024-06-13 at 09:06 -0400, Sasha Levin wrote: >> On Thu, Jun 13, 2024 at 07:58:58AM -0400, James Bottomley wrote: >> > On Thu, 2024-06-13 at 10:42 +0200, Thorsten Leemhuis wrote: >> > > The scenario shown at the start of the thread illustrates a >> > > problem I see frequently: commits with a Fixes: tag end up in new >> > > to stable series releases just days after being mainlined and >> > > cause regressions -- just like they do in mainline, which just >> > > was not known yet at the time of backporting. This happens >> > > extremely often right after merge windows when huge piles of >> > > changes are backported to the stable trees each cycle shortly >> > > after -rc1 is out (which even some kernel developers apparently >> > > are somewhat afraid to test from what I've >> > > seen). >> > >> > I haven't really observed this for curated fixes. For most >> > subsystems, patches with Fixes tags that are cc'd to stable tend to >> > go steadily outside the merge window. Obviously a few arrive >> > within it, but usually at roughly the rate they arrive outside it. >> > >> > What I observe in the merge window is huge piles of patches go into >> > stable *without* a cc:stable tag from the autosel machinery (and >> > quite a few even without fixes: tags). >> >> Could you provide a concrete example? This shouldn't happen. 
> >This one has no fixes or cc stable: > >https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=37f1663c91934f664fb850306708094a324c227c > >Yet here it is backported: > >Message-id: 20240603121056.1837607-1-sashal@kernel.org > >(I can't give a lore reference because conveniently it was a personal >cc with no tracked mailing list). > >I picked that one because we discovered a bug with the strlcpy to >strscpy conversions in SCSI which it looks like this backport has. In this case, we picked up the commit because it's a dependency for 527e9b704c3d ("scsi: qla2xxx: Use memcpy() and strlcpy() instead of strcpy() and strncpy()"), it didn't come in via autosel. >> > So this does beg a couple of questions: >> > >> > Since you have the figures, what's the proportion of regressions >> > caused by backports to stable without cc:stable tags? >> >> This question came up two years ago and we had statistics around >> this. Autosel patches didn't cause more (actually, it was *less*) >> regressions than stable tagged ones. > >OK, so Thorsten's stats should bear this out then ... Yup, this is an experiment we started about that time. We've extended autosel to be about 4 weeks behind where Linus is, and wanted to look at the statistics some time later to see if it improved anything. I would note here that even two years ago, autosel commits were slightly "safer" than stable tagged commits (w.r.t the odds of having a follow-up commit pointing back with a Fixes: tag.). >> > Could we fix a lot of this by delaying autosel? It tends to ramp >> > up in the merge window when everyone is concentrating on other >> > things, so any regressions it causes naturally get ignored for a >> > couple of weeks. >> >> autosel is currently delayed about 3-4 weeks, sometimes more. > >That's fairly recent then. When I look at 6.8 autosel began its flood >pretty much the moment the first SCSI pull request went in to the merge >window. 
Checking 6.9 it seems to be about 10 days after ... has that >made a difference, or is it too early to tell? The mails may come in during the merge window, but the commits aren't merged until 3-4 weeks later (we just present them for review early). In the 6.8 case, for example, the first autosel commit went into v6.8.6, which was released about a month after the merge window closed. -- Thanks, Sasha ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 11:58 ` James Bottomley 2024-06-13 13:06 ` Sasha Levin @ 2024-06-13 13:45 ` Greg KH 1 sibling, 0 replies; 107+ messages in thread From: Greg KH @ 2024-06-13 13:45 UTC (permalink / raw) To: James Bottomley; +Cc: Thorsten Leemhuis, ksummit On Thu, Jun 13, 2024 at 07:58:58AM -0400, James Bottomley wrote: > On Thu, 2024-06-13 at 10:42 +0200, Thorsten Leemhuis wrote: > > The scenario shown at the start of the thread illustrates a problem I > > see frequently: commits with a Fixes: tag end up in new to stable > > series releases just days after being mainlined and cause regressions > > -- just like they do in mainline, which just was not known yet at the > > time of backporting. This happens extremely often right after merge > > windows when huge piles of changes are backported to the stable trees > > each cycle shortly after -rc1 is out (which even some kernel > > developers apparently are somewhat afraid to test from what I've > > seen). > > I haven't really observed this for curated fixes. For most subsystems, > patches with Fixes tags that are cc'd to stable tend to go steadily > outside the merge window. Obviously a few arrive within it, but > usually at roughly the rate they arrive outside it. > > What I observe in the merge window is huge piles of patches go into > stable *without* a cc:stable tag from the autosel machinery (and quite > a few even without fixes: tags). The merge window has a huge number of patches sent to Linus _with_ a Fixes: tag, or a cc: stable tag. It's our busiest time of the cycle by far. But overall, it's still a smaller % of the patches that end up in the tree overall, so it looks big to us and everyone else, but it's really not. The % going in during the end -rc cycles is still higher, as it rightfully should. 
The only patches in our trees that lack a fixes: tag and did not come from autosel should either carry a stable-dep-of: tag or have been sent to us for explicit inclusion. If you see stuff that does not meet those criteria, please let us know. thanks, greg k-h ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 8:42 ` [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions Thorsten Leemhuis 2024-06-13 9:59 ` Jan Kara 2024-06-13 11:58 ` James Bottomley @ 2024-06-13 13:40 ` Sasha Levin 2024-06-18 13:12 ` Thorsten Leemhuis 2024-06-13 14:28 ` Andrew Lunn 3 siblings, 1 reply; 107+ messages in thread From: Sasha Levin @ 2024-06-13 13:40 UTC (permalink / raw) To: Thorsten Leemhuis; +Cc: ksummit On Thu, Jun 13, 2024 at 10:42:01AM +0200, Thorsten Leemhuis wrote: >I would like to discuss how to better prevent backports of mainline >commits to stable that turn out to cause regressions. If you can tell us which backports cause regression we promise not to backport them :) But more seriously: >* For patches that are tagged for backporting it's easy to for >developers to influence the timing, as they can use a stable tag like >`Cc: <stable@vger.kernel.org> # after -rc4` to delay backporting (see >Documentation/process/stable-kernel-rules.rst for details). But for We can delay, but in practice many of the regressions are discovered *because* they land in stable. There is relatively little testing on -rc releases. I'd argue that in most cases, delaying until a later point in time will just mean that the issue is discovered later, which isn't helpful... [ snip ] >* We could ask the stable team to only backport changes once they have >been in mainline for a certain time (something like "at the earliest two >weeks after the change was present in a mainline release or We could, but is the net result positive? This also means that fixes for real issues take longer to get to users. It would make sense if most backports cause a regression. Is it the case? >pre-release"?). But to not delay urgent fixes we then would need >developers to mark the urgent ones somehow. 
That is likely a hard sell, >but maybe less so then what the previous point outlined; untangling >could help here, too. I'd argue that even developers don't necessarily know if something is "urgent" or not. Heck, what does "urgent" mean? There are so many usecases for the kernel that it's impossible to define what is urgent and what is not. >* Maybe convince the stable team to consider all commits with just a >Fixes: tag as "non urgent", if they were merged during a merge window >with a committer (or author?) date from before the merge window -- and >then only backport them after -rc4 to ensure they got at least three >weeks of mainline testing before they are backported. This is imperfect >and has downsides, but would be relatively simple to realize. The tricky part here is that we can't rely on stable tags for importance determination. Individuals and subsystems simply don't add stable tags because they don't want to, not because their commits are not important or urgent. >* We could extend the Fixes tag in a fashion similar to the stable tag >(see above) to establish something like `Fixes: cafec0cacafe ("foo: bar: >foobar baz") # after -rc4 if considered backportworthy` -- but some of >these lines will become awfully long (they already are occasionally even >without this add-on note). See above :) -- Thanks, Sasha ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 13:40 ` Sasha Levin @ 2024-06-18 13:12 ` Thorsten Leemhuis 0 siblings, 0 replies; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-18 13:12 UTC (permalink / raw) To: Sasha Levin; +Cc: ksummit On 13.06.24 15:40, Sasha Levin wrote: > On Thu, Jun 13, 2024 at 10:42:01AM +0200, Thorsten Leemhuis wrote: >> I would like to discuss how to better prevent backports of mainline >> commits to stable that turn out to cause regressions. > If you can tell us which backports cause regression we promise not to > backport them :) :) FWIW (as you know, but others might not): I sometimes already do when I notice such a problem (of course that only works if the regression is tracked already). But the machinery and workflow for it could definitely be improved on my side; it's on my todo list, but so are many other things. :-/ >> * We could ask the stable team to only backport changes once they have >> been in mainline for a certain time (something like "at the earliest two >> weeks after the change was present in a mainline release or > > We could, but is the net result positive? This also means that fixes for > real issues take longer to get to users. Well, if the fix is that important or urgent it should have been merged in the previous cycle and not have waited for the merge window. > It would make sense if most backports cause a regression. Is it the > case? I can't answer that -- and the data I have is likely too incomplete for that, as I don't become aware of all regressions. >> pre-release"?). But to not delay urgent fixes we then would need >> developers to mark the urgent ones somehow. That is likely a hard sell, >> but maybe less so then what the previous point outlined; untangling >> could help here, too. > > I'd argue that even developers don't necessarily know if something is > "urgent" or not. Heck, what does "urgent" mean? 
There are so many > usecases for the kernel that it's impossible to define what is urgent > and what is not. Yup. :-/ >> * Maybe convince the stable team to consider all commits with just a >> Fixes: tag as "non urgent", if they were merged during a merge window >> with a committer (or author?) date from before the merge window -- and >> then only backport them after -rc4 to ensure they got at least three >> weeks of mainline testing before they are backported. This is imperfect >> and has downsides, but would be relatively simple to realize. > > The tricky part here is that we can't rely on stable tags for importance > determination. Individuals and subsystems simply don't add stable tags > because they don't want to, not because their commits are not important > or urgent. I know, I know. :-( That's why I introduced that section with "few thoughts my brain came up with", as I myself was unsure how to best improve the situation. Ciao, Thorsten ^ permalink raw reply [flat|nested] 107+ messages in thread
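The "Fixes:-only, merged during the merge window, committed before it" idea quoted above can be written down as a small predicate. This is only a sketch of the proposal as worded in the thread, not something the stable team does; the `Commit` fields and the function name `wait_for_rc4` are hypothetical, and the example window uses the 6.7 release and 6.8-rc1 timestamps that appear elsewhere in this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Commit:
    # Hypothetical record; a real tool would fill this from git metadata.
    has_fixes_tag: bool
    has_stable_tag: bool
    committer_date: datetime
    merged_date: datetime  # when the change landed in mainline


def wait_for_rc4(commit, window_open, window_close):
    """Return True if the proposal would hold the backport until after -rc4.

    Applies only to commits that carry a Fixes: tag but no stable tag,
    were merged during the merge window, and were committed before it.
    """
    only_fixes = commit.has_fixes_tag and not commit.has_stable_tag
    merged_in_window = window_open <= commit.merged_date <= window_close
    committed_before = commit.committer_date < window_open
    return only_fixes and merged_in_window and committed_before


# Example window: 6.7 release (window opens) to 6.8-rc1 (window closes).
WINDOW_OPEN = datetime(2024, 1, 7, 20, 29, tzinfo=timezone.utc)
WINDOW_CLOSE = datetime(2024, 1, 21, 22, 23, tzinfo=timezone.utc)
```

As Sasha points out, the weak spot is the input rather than the rule: many subsystems never add stable tags, so "has only a Fixes: tag" does not reliably mean "not urgent".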
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 8:42 ` [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions Thorsten Leemhuis ` (2 preceding siblings ...) 2024-06-13 13:40 ` Sasha Levin @ 2024-06-13 14:28 ` Andrew Lunn 2024-06-13 18:14 ` Sasha Levin 3 siblings, 1 reply; 107+ messages in thread From: Andrew Lunn @ 2024-06-13 14:28 UTC (permalink / raw) To: Thorsten Leemhuis; +Cc: ksummit > * One cause of regressions that happen in stable trees (and not in > mainline) I've seen quite a few times are backports of commits with > Fixes: tags that were part of a patch-series and depend on earlier > patches from the series. The stable-team afaics has no easy way to spot > this, as there is no way to check "was this change part of a series". This sounds like a tooling issue. git send-email knows a patch is part of a patch series. Maybe it should be adding some sort of cross reference between patches in a patch series. Andrew ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 14:28 ` Andrew Lunn @ 2024-06-13 18:14 ` Sasha Levin 2024-06-14 14:41 ` Jan Kara 0 siblings, 1 reply; 107+ messages in thread From: Sasha Levin @ 2024-06-13 18:14 UTC (permalink / raw) To: Andrew Lunn; +Cc: Thorsten Leemhuis, ksummit On Thu, Jun 13, 2024 at 04:28:47PM +0200, Andrew Lunn wrote: >> * One cause of regressions that happen in stable trees (and not in >> mainline) I've seen quite a few times are backports of commits with >> Fixes: tags that were part of a patch-series and depend on earlier >> patches from the series. The stable-team afaics has no easy way to spot >> this, as there is no way to check "was this change part of a series". > >This sounds like a tooling issue. git send-email knows a patch is part >of a patch series. Maybe it should be adding some sort of cross >reference between patches in a patch series. This came up in the past, and we have some machinery to check if a commit is part of a series or not, but in practice most of the series we see are actually not ones where patches depend on each other. -- Thanks, Sasha ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-13 18:14 ` Sasha Levin @ 2024-06-14 14:41 ` Jan Kara 2024-06-14 15:03 ` Rafael J. Wysocki 0 siblings, 1 reply; 107+ messages in thread From: Jan Kara @ 2024-06-14 14:41 UTC (permalink / raw) To: Sasha Levin; +Cc: Andrew Lunn, Thorsten Leemhuis, ksummit On Thu 13-06-24 14:14:47, Sasha Levin wrote: > On Thu, Jun 13, 2024 at 04:28:47PM +0200, Andrew Lunn wrote: > > > * One cause of regressions that happen in stable trees (and not in > > > mainline) I've seen quite a few times are backports of commits with > > > Fixes: tags that were part of a patch-series and depend on earlier > > > patches from the series. The stable-team afaics has no easy way to spot > > > this, as there is no way to check "was this change part of a series". > > > > This sounds like a tooling issue. git send-email knows a patch is part > > of a patch series. Maybe it should be adding some sort of cross > > reference between patches in a patch series. > > This came up in the past, and we have some machinery to check if a > commit is part of a series or not, but in practice most of the series we > see are actually not ones where patches depend on each other. I'm not sure I understand. Do you say most of the fixes you apply are from single-patch series? Or if the series has multiple patches, how do you decide whether some patch depends on other ones in the series or not? Because judging that sometimes requires rather detailed knowledge of the involved subsystem... Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 14:41 ` Jan Kara @ 2024-06-14 15:03 ` Rafael J. Wysocki 2024-06-14 17:46 ` Sasha Levin 0 siblings, 1 reply; 107+ messages in thread From: Rafael J. Wysocki @ 2024-06-14 15:03 UTC (permalink / raw) To: Jan Kara; +Cc: Sasha Levin, Andrew Lunn, Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 4:42 PM Jan Kara <jack@suse.cz> wrote: > > On Thu 13-06-24 14:14:47, Sasha Levin wrote: > > On Thu, Jun 13, 2024 at 04:28:47PM +0200, Andrew Lunn wrote: > > > > * One cause of regressions that happen in stable trees (and not in > > > > mainline) I've seen quite a few times are backports of commits with > > > > Fixes: tags that were part of a patch-series and depend on earlier > > > > patches from the series. The stable-team afaics has no easy way to spot > > > > this, as there is no way to check "was this change part of a series". > > > > > > This sounds like a tooling issue. git send-email knows a patch is part > > > of a patch series. Maybe it should be adding some sort of cross > > > reference between patches in a patch series. > > > > This came up in the past, and we have some machinery to check if a > > commit is part of a series or not, but in practice most of the series we > > see are actually not ones where patches depend on each other. > > I'm not sure I understand. Do you say most of the fixes you apply are > from single-patch series? Or if the series has multiple patches, how do you > decide whether some patch depends on other ones in the series or not? > Because judging that sometimes requires rather detailed knowledge of the > involved subsystem... Well, not always. If the series is of the "clean up this same thing all over the place" type, you can easily say that there are no dependencies between patches in it. ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions 2024-06-14 15:03 ` Rafael J. Wysocki @ 2024-06-14 17:46 ` Sasha Levin 0 siblings, 0 replies; 107+ messages in thread From: Sasha Levin @ 2024-06-14 17:46 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Jan Kara, Andrew Lunn, Thorsten Leemhuis, ksummit On Fri, Jun 14, 2024 at 05:03:17PM +0200, Rafael J. Wysocki wrote: >On Fri, Jun 14, 2024 at 4:42 PM Jan Kara <jack@suse.cz> wrote: >> >> On Thu 13-06-24 14:14:47, Sasha Levin wrote: >> > On Thu, Jun 13, 2024 at 04:28:47PM +0200, Andrew Lunn wrote: >> > > > * One cause of regressions that happen in stable trees (and not in >> > > > mainline) I've seen quite a few times are backports of commits with >> > > > Fixes: tags that were part of a patch-series and depend on earlier >> > > > patches from the series. The stable-team afaics has no easy way to spot >> > > > this, as there is no way to check "was this change part of a series". >> > > >> > > This sounds like a tooling issue. git send-email knows a patch is part >> > > of a patch series. Maybe it should be adding some sort of cross >> > > reference between patches in a patch series. >> > >> > This came up in the past, and we have some machinery to check if a >> > commit is part of a series or not, but in practice most of the series we >> > see are actually not ones where patches depend on each other. >> >> I'm not sure I understand. Do you say most of the fixes you apply are >> from single-patch series? Or if the series has multiple patches, how do you >> decide whether some patch depends on other ones in the series or not? >> Because judging that sometimes requires rather detailed knowledge of the >> involved subsystem... > >Well, not always. If the series is of the "clean up this same thing >all over the place" type, you can easily say that there are no >dependencies between patches in it. 
A few subsystems also use good old patchbombs instead of pull requests, which leads to patches being part of those series without actually having anything to do with each other. I guess my point is that it's not as simple as looking at whether a commit is in a series or not. -- Thanks, Sasha ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-13 8:22 [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions Thorsten Leemhuis ` (3 preceding siblings ...) 2024-06-13 8:42 ` [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions Thorsten Leemhuis @ 2024-06-18 14:43 ` James Bottomley 2024-06-18 15:50 ` Mark Brown 2024-06-20 10:32 ` Thorsten Leemhuis 4 siblings, 2 replies; 107+ messages in thread From: James Bottomley @ 2024-06-18 14:43 UTC (permalink / raw) To: Thorsten Leemhuis, ksummit On Thu, 2024-06-13 at 10:22 +0200, Thorsten Leemhuis wrote: > Lo! I prepared four proposals for the maintainers summit regarding > regressions I'll send in reply to this mail. They are somewhat > related and address different aspects of one scenario I see > frequently in different variations; so instead of repeating that > scenario in slightly modified form in each of the proposals, I'm > putting it out here once: I think you're missing a piece here about how we actually find regressions. A lot, it is true, come from test suites running on the mainline. However, for obscure drivers and even some more complex dependencies, the regression sometimes isn't discovered until it gets into the hands of the wider pool of testers, often via stable. This is important, because it emphasizes that zero regressions in stable is impossible (and thus preventing backporting patches that cause regressions is also impossible) if stable is the vehicle by which some regressions are discovered. Plus it also means that a backport delay or cadence would actually delay discovery of some regressions because the patches that cause them won't be seen by the configs that run into them until they get put into stable. 
There's also a longer delay in discovery of the actual upstream commit because bugs in stable need to be reproduced or at least identified in mainline before we can fix them and the discoverers often have a harder time than mainline users in helping with this. This stable being both a vehicle for fixed kernels and a testing platform for regressions is a tension I don't think we can (or should) resolve. So what should we do about this? I think the first thing is to recognize the important role stable plays in actually finding bugs. There already is a -rc tree for stable, but it doesn't actually seem to be very useful in finding bugs (likely because the pool of testers is too small), so perhaps we should discuss whether we could expand this, or whether we really accept that non-rc stable is part of our testing infrastructure. The other thing I think would help is better tooling and advice to help reporters find regressions in stable. What we do a lot upstream is ask if they can reproduce it in mainline. However, not everyone is equipped to test out mainline kernels, so we could do with helping them bisect it in stable (note this can be time dependent: older stable trees more naturally give rise to the question "has this been fixed upstream" making mainline testing more of an imperative). Regards, James ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-18 14:43 ` [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions James Bottomley @ 2024-06-18 15:50 ` Mark Brown 2024-06-20 10:32 ` Thorsten Leemhuis 1 sibling, 0 replies; 107+ messages in thread From: Mark Brown @ 2024-06-18 15:50 UTC (permalink / raw) To: James Bottomley; +Cc: Thorsten Leemhuis, ksummit [-- Attachment #1: Type: text/plain, Size: 1265 bytes --] On Tue, Jun 18, 2024 at 10:43:49AM -0400, James Bottomley wrote: > So what should we do about this? I think the first thing is to > recognize the important role stable plays in actually finding bugs. > There already is a -rc tree for stable, but it doesn't actually seem to > be very useful in finding bugs (likely because the pool of testers is > too small), so perhaps we should discuss whether we could expand this, > or whether we really accept that non-rc stable is part of our testing > infrastructure. The pool of testers is quite small, and the turnarounds for responses are relatively tight which precludes certain kinds of testing. > The other thing I think would help is better tooling and advice to help > reporters find regressions in stable. What we do a lot upstream is ask > if they can reproduce it in mainline. However, not everyone is > equipped to test out mainline kernels, so we could do with helping them > bisect it in stable (note this can be time dependent: older stable > trees more naturally give rise to the question "has this been fixed > upstream" making mainline testing more of an imperative). Also questions like "can I get this building and running without reworking my development infrastructure". [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-18 14:43 ` [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions James Bottomley 2024-06-18 15:50 ` Mark Brown @ 2024-06-20 10:32 ` Thorsten Leemhuis 2024-06-20 12:57 ` James Bottomley 1 sibling, 1 reply; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-20 10:32 UTC (permalink / raw) To: James Bottomley, ksummit On 18.06.24 16:43, James Bottomley wrote: > On Thu, 2024-06-13 at 10:22 +0200, Thorsten Leemhuis wrote: >> Lo! I prepared four proposals for the maintainers summit regarding >> regressions I'll send in reply to this mail. They are somewhat >> related and address different aspects of one scenario I see >> frequently in different variations; so instead of repeating that >> scenario in slightly modified form in each of the proposals, I'm >> putting it out here once: > > I think you're missing a piece here about how we actually find > regressions. A lot, it is true, come from test suites running on the > mainline. Sure. > However, for obscure drivers and even some more complex > dependencies, the regression sometimes isn't discovered until it gets > into the hands of the wider pool of testers, often via stable. > > This is important, because it emphasizes that zero regressions in > stable is impossible (and thus preventing backporting patches that > cause regressions is also impossible) if stable is the vehicle by which > some regressions are discovered. Of course "Zero regressions in stable is impossible" as we are dealing with software. ;) And of course even with delayed backport for non-urgent fixes some problems would make it through. But right now users testing mainline sometimes hardly have a chance to test and report problems with mainline in time to prevent a backport. 
Take Linux 6.7.2 (released 2024-01-25 23:58 UTC) with its 640 changes for example, where users had only 4 days to do so, as almost all of its changes had been merged for 6.8-rc1 (2024-01-21 22:23 UTC). FWIW: 200 of those changes were committed to some subsystem git tree during January, 363 during December, 70 during November, and 7 during October. So if those 440 fixes could wait some time to be mainlined and were not important enough to get into 6.7 (2024-01-07 20:29 UTC) in the first place, why the rush backporting them to 6.7.y so quickly after the merge window? All that leads to the related question "How many of those changes maybe should have gone into 6.7?". And maybe even "Should we somehow try to motivate more people to try -next?". But those are different problems. And the situation regarding the first already got somewhat better from what I can see -- among other things, afaics, due to me prodding people when they queue fixes for recent regressions for the next merge window. > Plus it also means that a backport > delay or cadence would actually delay discovery of some regressions > because the patches that cause them won't be seen by the configs that > run into them until they get put into stable. And why is that a problem? > [...] > > The other thing I think would help is better tooling and advice to help > reporters find regressions in stable. What we do a lot upstream is ask > if they can reproduce it in mainline. However, not everyone is > equipped to test out mainline kernels, so we could do with helping them > bisect it in stable FWIW Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst / https://docs.kernel.org/admin-guide/verify-bugs-and-bisect-regressions.html covers this: users that notice a regression in a stable tree will bisect that tree. But before... > (note this can be time dependent: older stable > trees more naturally give rise to the question "has this been fixed > upstream" making mainline testing more of an imperative).
...it does so, but tells users to try mainline for two reasons:
* It might be fixed there already.
* When Greg receives a regression report for stable he'll usually ask "is mainline also affected" anyway to figure out if this is something he or somebody else has to look into. And some of the mainline developers will ask this, too.
Ciao, Thorsten ^ permalink raw reply [flat|nested] 107+ messages in thread
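The 6.7.2 figures quoted earlier in this message can be checked with a few lines of Python. This is only a sketch of the arithmetic: the timestamps and per-month counts are the ones given in the mail, while the variable names are made up for illustration.

```python
from datetime import datetime, timezone

# Release timestamps as quoted in the mail (UTC).
rc1_6_8 = datetime(2024, 1, 21, 22, 23, tzinfo=timezone.utc)       # 6.8-rc1
stable_6_7_2 = datetime(2024, 1, 25, 23, 58, tzinfo=timezone.utc)  # 6.7.2

# Time mainline testers had between 6.8-rc1 and the 6.7.2 backports.
window_days = (stable_6_7_2 - rc1_6_8).total_seconds() / 86400

# Changes in 6.7.2, grouped by the month in which they were committed
# to a subsystem tree (numbers from the mail).
committed = {"October": 7, "November": 70, "December": 363, "January": 200}
total = sum(committed.values())
before_january = total - committed["January"]

print(f"testing window: {window_days:.2f} days")
print(f"total changes: {total}, committed before January: {before_january}")
```

The "440 fixes" mentioned above are thus the changes that had already been sitting in a subsystem tree before January.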
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-20 10:32 ` Thorsten Leemhuis @ 2024-06-20 12:57 ` James Bottomley 2024-06-20 13:55 ` Mark Brown 2024-06-20 16:59 ` Thorsten Leemhuis 0 siblings, 2 replies; 107+ messages in thread From: James Bottomley @ 2024-06-20 12:57 UTC (permalink / raw) To: Thorsten Leemhuis, ksummit On Thu, 2024-06-20 at 12:32 +0200, Thorsten Leemhuis wrote: > On 18.06.24 16:43, James Bottomley wrote: > > On Thu, 2024-06-13 at 10:22 +0200, Thorsten Leemhuis wrote: > > > Lo! I prepared four proposals for the maintainers summit > > > regarding regressions I'll send in reply to this mail. They are > > > somewhat related and address different aspects of one scenario I > > > see frequently in different variations; so instead of repeating > > > that scenario in slightly modified form in each of the proposals, > > > I'm putting it out here once: > > > > I think you're missing a piece here about how we actually find > > regressions. A lot, it is true, come from test suites running on > > the mainline. > > Sure. > > > However, for obscure drivers and even some more complex > > dependencies, the regression sometimes isn't discovered until it > > gets into the hands of the wider pool of testers, often via stable. > > > > This is important, because it emphasizes that zero regressions in > > stable is impossible (and thus preventing backporting patches that > > cause regressions is also impossible) if stable is the vehicle by > > which some regressions are discovered. > > Of course "Zero regressions in stable is impossible" as we are > dealing with software. ;) And of course even with delayed backport > for non-urgent fixes some problems would make it through. > > But right now users testing mainline sometimes hardly have a chance > to test and report problems with mainline in time to prevent a > backport. 
Take Linux 6.7.2 (released 2024-01-25 23:58 UTC) with its > 640 changes for example, where users had only 4 days to do so, as > almost all of its changes had been merged for 6.8-rc1 (2024-01-21 > 22:23 UTC). FWIW: 200 of those changes were committed to some > subsystem git tree during January, 363 during December, 70 during > November, and 7 during October. I did make this point here: https://lore.kernel.org/all/7794a2b09ae4fa73ac35fdaec4858145a665efea.camel@HansenPartnership.com/ That merge window fixes should be delayed. Not because I think a longer soak in main would allow us to find many more bugs, simply because it was causing reports in the merge window that weren't handled because people had other things to do. The reply was that they're already doing it and when I looked, they actually started doing it for the 6.9 merge window (so your 6.7 example is probably out of date). > So if those 440 fixes could wait some time to be mainlined and were > not important enough to get into 6.7 (2024-01-07 20:29 UTC) in the > first place, why the rush backporting them to 6.7.y so quickly after > the merge window? > > All that leads to the related question "How many of those changes > maybe should have gone into 6.7?". And maybe even "Should we somehow > try to motivate more people to try -next?". Actually, if we got more people to try mainline we could perhaps find more bugs. Testing -next is problematic because its instability makes things like bisection and update to next release difficult. > But those are different problems. > And the situation regarding the first already got somewhat better > from what I can see -- among others afaics due to me prodding people > when the queue fixes for recent regression for the -next merge > window. Yes, that's why I was asking for stats on 6.9 and 6.10 where this delay policy was apparently in place. 
> > Plus it also means that a backport > > delay or cadence would actually delay discovery of some regressions > > because the patches that cause them won't be seen by the configs > > that run into them until they get put into stable. > > And why is that a problem? Because a regression we haven't found yet is still a regression. If all we cared about was minimizing the regression stats, we could simply not look for any of them. But we do care about this, so we need to support all our mechanisms for finding them and the point I was making is that one such mechanism is the early backports to stable. There is probably a sweet spot backport delay for regressions we do eventually find in main, but for regressions that others only find in stable (and would never have been found in main however long we delayed) arbitrary delays merely increase the time to finding them. Perhaps one thing we should track with regressions is time to discovery and also ask about ones in stable if they could have been found in mainline? That would give us more data for tuning the backport delay. > > [...] > > > > The other thing I think would help is better tooling and advice to > > help reporters find regressions in stable. What we do a lot > > upstream is ask if they can reproduce it in mainline. However, not > > everyone is equipped to test out mainline kernels, so we could do > > with helping them bisect it in stable > > FWIW Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst > / > https://docs.kernel.org/admin-guide/verify-bugs-and-bisect-regressions.html > covers this: users that notice a regression in a stable tree will > bisect that tree. But before... Some do, but realistically the best others can do is this bug was in X.Y.Z but not in X.Y.Z-1 because they can't build their own kernels. > > (note this can be time dependent: older stable > > trees more naturally give rise to the question "has this been fixed > > upstream" making mainline testing more of an imperative).
> > ...it does so, but tells users to try mainline for two reasons: > * It might be fixed there already. > * When Greg receives a regression report for stable he'll usually ask > "is mainline also affected" anyway to figure out if this is something > he or somebody else has to look into. And some of the mainline > developer will ask this, too. Again not saying that's wrong, just saying we must accept that some bugs will only be found in stable and thus we could do with improving our tooling to help stable users pinpoint the backport that caused them. James ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-20 12:57 ` James Bottomley @ 2024-06-20 13:55 ` Mark Brown 2024-06-20 14:01 ` James Bottomley 2024-06-20 16:59 ` Thorsten Leemhuis 1 sibling, 1 reply; 107+ messages in thread From: Mark Brown @ 2024-06-20 13:55 UTC (permalink / raw) To: James Bottomley; +Cc: Thorsten Leemhuis, ksummit [-- Attachment #1: Type: text/plain, Size: 1014 bytes --] On Thu, Jun 20, 2024 at 08:57:29AM -0400, James Bottomley wrote: > Actually, if we got more people to try mainline we could perhaps find > more bugs. Testing -next is problematic because its instability makes > things like bisection and update to next release difficult. -next is problematic to actually *use* but it's not particularly bad for testing, mostly it's fine but you have to be able to cope with things going bad on you in potentially very bad ways. For testing the stability is generally perfectly fine, and given that the whole goal is to find problems it's hard to see much of an issue. Bisection also works about as well as for mainline - you need to bisect from whatever commit in Linus' tree things were based off (or pending-fixes if you know that was fine) rather than a prior -next tag but otherwise I can't say I notice much difference to mainline. If your tests take more than a day to run then it gets more tricky, but that's just generally harder no matter which tree you're testing. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-20 13:55 ` Mark Brown @ 2024-06-20 14:01 ` James Bottomley 2024-06-20 14:42 ` Mark Brown 0 siblings, 1 reply; 107+ messages in thread From: James Bottomley @ 2024-06-20 14:01 UTC (permalink / raw) To: Mark Brown; +Cc: Thorsten Leemhuis, ksummit [-- Attachment #1: Type: text/plain, Size: 1821 bytes --] On Thu, 2024-06-20 at 14:55 +0100, Mark Brown wrote: > On Thu, Jun 20, 2024 at 08:57:29AM -0400, James Bottomley wrote: > > > Actually, if we got more people to try mainline we could perhaps > > find more bugs. Testing -next is problematic because its > > instability makes things like bisection and update to next release > > difficult. > > -next is problematic to actually *use* but it's not particularly bad > for testing, mostly it's fine but you have to be able to cope with > things going bad in you in potentially very bad ways. For testing > the stability is generally perfectly fine, and given that the whole > goal is to find problems it's hard to see much of an issue. > Bisection also works about as well as for mainline - you need to > bisect from whatever commit in Linus' tree things were based off (or > pending-fixes if you know that was fine) rather than a prior -next > tag but otherwise I can't say I notice much difference to mainline. > > If your tests take more than a day to run then it gets more tricky, > but that's just generally harder no matter which tree you're testing. The difficulty is usually that by the time you get a signal something is wrong, the next tree is different. 
I agree you can freeze on the next tree you have and hope that the identified commit (by the time you find it) is still in the current version of -next, but there is a non-zero chance it would get rebased which makes testing next a bit more of a chore than testing main, which is why it's tested less often than main. Regardless, I don't think -next is a useful tree for the wider pool who usually test stable to try because of all the difficulties. I do think it's not impossible to get some of them to move up to main (after all it's the .0 of stable). James [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-20 14:01 ` James Bottomley @ 2024-06-20 14:42 ` Mark Brown 2024-06-20 16:02 ` James Bottomley 0 siblings, 1 reply; 107+ messages in thread From: Mark Brown @ 2024-06-20 14:42 UTC (permalink / raw) To: James Bottomley; +Cc: Thorsten Leemhuis, ksummit [-- Attachment #1: Type: text/plain, Size: 1618 bytes --] On Thu, Jun 20, 2024 at 10:01:57AM -0400, James Bottomley wrote: > On Thu, 2024-06-20 at 14:55 +0100, Mark Brown wrote: > > If your tests take more than a day to run then it gets more tricky, > > but that's just generally harder no matter which tree you're testing. > The difficulty is usually that by the time you get a signal something > is wrong, the next tree is different. I agree you can freeze on the That'd be the tests taking more than a day bit. > next tree you have and hope that the identified commit (by the time you > find it) is still in the current version of -next, but there is a non- > zero chance it would get rebased which makes testing next a bit more of > a chore than testing main, which is why it's tested less often than > main Obviously some trees do rebase, but not constantly and a lot of trees simply don't rebase - carrying things forward to the next day tends to be more of a mild annoyance IME, especially if you remember all the good and bad commits and don't need to restart from scratch. > Regardless, I don't think -next is a useful tree for the wider pool who > usually test stable to try because of all the difficulties. I do think > it's not impossible to get some of them to move up to main (after all > it's the .0 of stable). 
AFAICT we have a far wider pool of people testing -next than we do testing the stable -rcs at the minute, there's more people trying to *use* stables and finding issues but that's not quite the same thing and I suspect much of the plain testing is going to be qualification for release so it'd be hard to get people to substitute mainline. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-20 14:42 ` Mark Brown @ 2024-06-20 16:02 ` James Bottomley 2024-06-20 17:15 ` Mark Brown ` (2 more replies) 0 siblings, 3 replies; 107+ messages in thread From: James Bottomley @ 2024-06-20 16:02 UTC (permalink / raw) To: Mark Brown; +Cc: Thorsten Leemhuis, ksummit [-- Attachment #1: Type: text/plain, Size: 2725 bytes --] On Thu, 2024-06-20 at 15:42 +0100, Mark Brown wrote: > On Thu, Jun 20, 2024 at 10:01:57AM -0400, James Bottomley wrote: > > On Thu, 2024-06-20 at 14:55 +0100, Mark Brown wrote: > > > > If your tests take more than a day to run then it gets more > > > tricky, but that's just generally harder no matter which tree > > > you're testing. > > > The difficulty is usually that by the time you get a signal > > something is wrong, the next tree is different. I agree you can > > freeze on the > > That'd be the tests taking more than a day bit. Depends ... we might be using different terms. I think of testing as simply finding the bug. After that there's usually a whole load of work to pinpoint the commit that caused it, so even if a test only takes say 30 minutes to run, the bisection can take over a day. > > next tree you have and hope that the identified commit (by the time > > you find it) is still in the current version of -next, but there is > > a non-zero chance it would get rebased which makes testing next a > > bit more of a chore than testing main, which is why it's tested > > less often than main > > Obviously some trees do rebase, but not constantly and a lot of trees > simply don't rebase - carrying things forward to the next day tends > to be more of a mild annoyance IME, especially if you remember all > the good and bad commits and don't need to restart from scratch. I agree that -next is mostly an unstable tree built from reasonably stable branches, yes. 
> > Regardless, I don't think -next is a useful tree for the wider pool > > who usually test stable to try because of all the difficulties. I > > do think it's not impossible to get some of them to move up to main > > (after all it's the .0 of stable). > > AFAICT we have a far wider pool of people testing -next than we do > testing the stable -rcs at the minute, there's more people trying to > *use* stables and finding issues but that's not quite the same thing > and I suspect much of the plain testing is going to be qualification > for release so it'd be hard to get people to substitute mainline. Right, but the point I'm making is that even that wider pool doesn't have the app use or hardware breadth of the pool who try out stable. I also agree the stable users would rather not be testers but given that they are, it's not impossible we could sell them on the idea of testing out .0 to find bugs they would otherwise be finding in .n. After all, given that stable is now delaying backports in the merge window, there should be at least a 2 week period where .0 is it (although it's also the two week period where we're not paying attention ...) James [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-20 16:02 ` James Bottomley @ 2024-06-20 17:15 ` Mark Brown 2024-06-20 23:25 ` Sasha Levin [not found] ` <20240625175131.672d14a4@rorschach.local.home> 2 siblings, 0 replies; 107+ messages in thread From: Mark Brown @ 2024-06-20 17:15 UTC (permalink / raw) To: James Bottomley; +Cc: Thorsten Leemhuis, ksummit [-- Attachment #1: Type: text/plain, Size: 2815 bytes --] On Thu, Jun 20, 2024 at 12:02:21PM -0400, James Bottomley wrote: > On Thu, 2024-06-20 at 15:42 +0100, Mark Brown wrote: > > On Thu, Jun 20, 2024 at 10:01:57AM -0400, James Bottomley wrote: > > > On Thu, 2024-06-20 at 14:55 +0100, Mark Brown wrote: > > > > If your tests take more than a day to run then it gets more > > > > tricky, but that's just generally harder no matter which tree > > > > you're testing. > > > The difficulty is usually that by the time you get a signal > > > something is wrong, the next tree is different. I agree you can > > > freeze on the > > That'd be the tests taking more than a day bit. > Depends ... we might be using different terms. I think of testing as > simply finding the bug. After that there's usually a whole load of > work to pinpoint the commit that caused it, so even if a test only > takes say 30 minutes to run, the bisection can take over a day. Sure, but unless the tree with the issue rebases constantly, so long as you can bisect into the tree and then some within a day, that's not going to stop progress (and a lot of the time just finishing the bisect and then validating on today's -next is fine). IME the effort with -next is worth it for the turnaround time, it's a lot easier to get attention on recently merged patches. > > > Regardless, I don't think -next is a useful tree for the wider pool > > > who usually test stable to try because of all the difficulties.
I > > > do think it's not impossible to get some of them to move up to main > > > (after all it's the .0 of stable). > > AFAICT we have a far wider pool of people testing -next than we do > > testing the stable -rcs at the minute, there's more people trying to > > *use* stables and finding issues but that's not quite the same thing > > and I suspect much of the plain testing is going to be qualification > > for release so it'd be hard to get people to substitute mainline. > Right, but the point I'm making is that even that wider pool doesn't > have the app use or hardware breadth of the pool who try out stable. I > also agree the stable users would rather not be testers but given that > they are, it's not impossible we could sell them on the idea of testing > out .0 to find bugs they would otherwise be finding in .n. I suspect you'll find that a lot of the people who have the capacity and engagement to do that are already doing so. > After all, given that stable is now delaying backports in the merge > window, there should be at least a 2 week period where .0 is it > (although it's also the two week period where we're not paying > attention ...) Yeah, and it also depends on people being able to easily run mainline which if for example people are carrying out of tree patches might be a bit of an issue. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-20 16:02 ` James Bottomley 2024-06-20 17:15 ` Mark Brown @ 2024-06-20 23:25 ` Sasha Levin 2024-06-21 6:33 ` Thorsten Leemhuis [not found] ` <20240625175131.672d14a4@rorschach.local.home> 2 siblings, 1 reply; 107+ messages in thread From: Sasha Levin @ 2024-06-20 23:25 UTC (permalink / raw) To: James Bottomley; +Cc: Mark Brown, Thorsten Leemhuis, ksummit On Thu, Jun 20, 2024 at 12:02:21PM -0400, James Bottomley wrote: >Right, but the point I'm making is that even that wider pool doesn't >have the app use or hardware breadth of the pool who try out stable. I >also agree the stable users would rather not be testers but given that >they are, it's not impossible we could sell them on the idea of testing >out .0 to find bugs they would otherwise be finding in .n. > >After all, given that stable is now delaying backports in the merge >window, there should be at least a 2 week period where .0 is it >(although it's also the two week period where we're not paying >attention ...) We also keep the prior kernel alive for a few weeks *after* a merge window. We understand that X.Y.Z for Z<~5 kernels receive many changes and need additional testing, and so users have the option of staying on the Y-1 kernel for a few weeks until issues with X.Y are settled. So yes, users should have "at least two", but really "at least five" weeks to find out issues in a post-merge-window release. -- Thanks, Sasha ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-20 23:25 ` Sasha Levin @ 2024-06-21 6:33 ` Thorsten Leemhuis 0 siblings, 0 replies; 107+ messages in thread From: Thorsten Leemhuis @ 2024-06-21 6:33 UTC (permalink / raw) To: Sasha Levin, James Bottomley; +Cc: Mark Brown, ksummit On 21.06.24 01:25, Sasha Levin wrote: > On Thu, Jun 20, 2024 at 12:02:21PM -0400, James Bottomley wrote: >> Right, but the point I'm making is that even that wider pool doesn't >> have the app use or hardware breadth of the pool who try out stable. I >> also agree the stable users would rather not be testers but given that >> they are, it's not impossible we could sell them on the idea of testing >> out .0 to find bugs they would otherwise be finding in .n. >> >> After all, given that stable is now delaying backports in the merge >> window, there should be at least a 2 week period where .0 is it >> (although it's also the two week period where we're not paying >> attention ...) > > We also keep the prior kernel alive for a few weeks *after* a merge > window. > > We understand that X.Y.Z for Z<~5 kernels receive many changes and need > additional testing, and so users have the option of staying on the Y-1 > kernel for a few weeks until issues with X.Y are settled. > > So yes, users should have "at least two", but really "at least five" > weeks to find out issues in a post-merge-window release. Hmmm. Is it really "at least five"? 6.8.12 was the last 6.8.y release and it came out on 2024-05-30 7:59 UTC in parallel to the 6.9.3 release I mentioned in another mail yesterday -- and thus also just three and a half days after 6.10-rc1 was out. And from what I see it also contained 384 patches from the 6.10 merge window where I wonder how much testing they have seen during that short time-frame. FWIW, the above and other things I said yesterday may sound like I'm complaining about the way the stable maintainers work. 
But to be clear: Given the circumstances I understand why things are as they are. But to reduce the risk of regressions in stable trees I wonder if we can improve the circumstances somewhat, so that the non-urgent patches among those 384 changes never would have made it to 6.8.y in the first place -- and only make it to 6.9.y (and earlier longterm series) once they saw more testing in mainline. Like at least 50 among those 384 changes that were committed to some subsystem tree during February and March and therefore took weeks to get mainlined -- and thus are unlikely to be urgent or crucial (or should have been mainlined way earlier in the first place). I guess the same applies to many or all of the 189 changes committed to some tree during April, too. For those from May it's harder to say without a way to mark the non-urgent ones. Or do something entirely different. Like "only backport changes quickly that have a stable tag; everything that has a Fixes: tag is only backported after it has been in at least three -rc (IOW: two weeks), unless someone asked for a quicker backport". But that way we risk that some urgent fixes lacking a stable tag take too long to get backported. That sounds like a worse idea to me. #sigh Ciao, Thorsten ^ permalink raw reply [flat|nested] 107+ messages in thread
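The alternative rule floated at the end of this message ("only backport changes quickly that have a stable tag; everything that has a Fixes: tag is only backported after it has been in at least three -rc, unless someone asked for a quicker backport") could be sketched roughly as follows. This is purely illustrative and not how the stable team's tooling works: the `Change` class, its fields, and `ready_for_backport` are hypothetical names, and the treatment of changes carrying neither tag is an assumption the mail does not cover.

```python
from dataclasses import dataclass

# "at least three -rc (IOW: two weeks)", as stated in the mail.
MIN_RCS_FOR_FIXES_ONLY = 3


@dataclass
class Change:
    """Hypothetical record describing one mainline commit."""
    has_stable_tag: bool            # commit explicitly tagged for stable
    has_fixes_tag: bool             # commit carries a "Fixes:" tag
    rcs_in_mainline: int            # mainline -rc releases that contain it
    quick_backport_requested: bool = False


def ready_for_backport(change: Change) -> bool:
    """Return True if the change may be backported now under the sketched rule."""
    # Stable-tagged changes, or ones someone asked to expedite, go in quickly.
    if change.has_stable_tag or change.quick_backport_requested:
        return True
    # Fixes:-only changes wait until they have been in enough -rc releases.
    if change.has_fixes_tag:
        return change.rcs_in_mainline >= MIN_RCS_FOR_FIXES_ONLY
    # Assumption: changes with neither tag are not picked up at all.
    return False


# A Fixes:-only change that has been in a single -rc has to wait.
print(ready_for_backport(Change(False, True, rcs_in_mainline=1)))
# The same change after three -rc releases is eligible.
print(ready_for_backport(Change(False, True, rcs_in_mainline=3)))
```

The sketch also makes the risk named in the mail visible: an urgent fix that lacks a stable tag falls into the slow path unless someone explicitly requests a quicker backport.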
[parent not found: <20240625175131.672d14a4@rorschach.local.home>]
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions [not found] ` <20240625175131.672d14a4@rorschach.local.home> @ 2024-06-26 7:36 ` Greg KH 2024-06-26 18:32 ` Steven Rostedt 2024-07-25 10:14 ` Thorsten Leemhuis 0 siblings, 2 replies; 107+ messages in thread From: Greg KH @ 2024-06-26 7:36 UTC (permalink / raw) To: Steven Rostedt Cc: James Bottomley, Mark Brown, Thorsten Leemhuis, ksummit, Sasha Levin On Tue, Jun 25, 2024 at 05:51:31PM -0400, Steven Rostedt wrote: > On Thu, 20 Jun 2024 12:02:21 -0400 > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > After all, given that stable is now delaying backports in the merge > > window, there should be at least a 2 week period where .0 is it > > (although it's also the two week period where we're not paying > > attention ...) > > I'm curious. Is there a stable branch that adds the stable patches in > continuously? That is, during the merge window, to have a branch that > adds the stable patches as they come in and then when the merge window > closes, to post the rc series with all the patches that have landed in > that branch? Yes, it's in the stable-queue git tree. And in the linux-stable-rc tree for those that can not consume quilt trees. Been there for years... thanks, greg k-h ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-26 7:36 ` Greg KH @ 2024-06-26 18:32 ` Steven Rostedt 2024-06-26 19:05 ` James Bottomley 2024-07-25 10:14 ` Thorsten Leemhuis 1 sibling, 1 reply; 107+ messages in thread From: Steven Rostedt @ 2024-06-26 18:32 UTC (permalink / raw) To: Greg KH Cc: James Bottomley, Mark Brown, Thorsten Leemhuis, ksummit, Sasha Levin On Wed, 26 Jun 2024 09:36:22 +0200 Greg KH <gregkh@linuxfoundation.org> wrote: > > I'm curious. Is there a stable branch that adds the stable patches in > > continuously? That is, during the merge window, to have a branch that > > adds the stable patches as they come in and then when the merge window > > closes, to post the rc series with all the patches that have landed in > > that branch? > > Yes, it's in the stable-queue git tree. And in the linux-stable-rc tree > for those that can not consume quilt trees. Been there for years... > Perhaps we should be encouraging people to download the linux-stable-rc and start testing that more? Just because it's been there for years, doesn't mean people are aware of it. -- Steve ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions 2024-06-26 18:32 ` Steven Rostedt @ 2024-06-26 19:05 ` James Bottomley 0 siblings, 0 replies; 107+ messages in thread From: James Bottomley @ 2024-06-26 19:05 UTC (permalink / raw) To: Steven Rostedt, Greg KH Cc: Mark Brown, Thorsten Leemhuis, ksummit, Sasha Levin On Wed, 2024-06-26 at 14:32 -0400, Steven Rostedt wrote: > On Wed, 26 Jun 2024 09:36:22 +0200 > Greg KH <gregkh@linuxfoundation.org> wrote: > > > > I'm curious. Is there a stable branch that adds the stable > > > patches in continuously? That is, during the merge window, to > > > have a branch that adds the stable patches as they come in and > > > then when the merge window closes, to post the rc series with all > > > the patches that have landed in that branch? > > > > Yes, it's in the stable-queue git tree. And in the linux-stable-rc > > tree for those that can not consume quilt trees. Been there for > > years... > > > > Perhaps we should be encouraging people to download the linux-stable- > rc and start testing that more? Well, that was a note in the original top post of this thread (second paragraph from the bottom): https://lore.kernel.org/all/54f26c0959f796c52f04da9e831899f6482686ac.camel@HansenPartnership.com/ > Just because it's been there for years, doesn't mean people are aware > of it. The observation I made that no-one challenged is that no-one really tests stable-rc trees. I also asked if we should promote it more, but I really think stable itself is good enough and it would only cause confusion if we promoted an additional less stable stable tree. James ^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions
2024-06-26 7:36 ` Greg KH
2024-06-26 18:32 ` Steven Rostedt
@ 2024-07-25 10:14 ` Thorsten Leemhuis
2024-07-25 13:14 ` Greg KH
1 sibling, 1 reply; 107+ messages in thread
From: Thorsten Leemhuis @ 2024-07-25 10:14 UTC (permalink / raw)
To: Greg KH, Steven Rostedt; +Cc: James Bottomley, Mark Brown, ksummit, Sasha Levin

On 26.06.24 09:36, Greg KH wrote:
> On Tue, Jun 25, 2024 at 05:51:31PM -0400, Steven Rostedt wrote:
>> On Thu, 20 Jun 2024 12:02:21 -0400
>> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>>
>>> After all, given that stable is now delaying backports in the merge
>>> window, there should be at least a 2 week period where .0 is it
>>> (although it's also the two week period where we're not paying
>>> attention ...)
>>
>> I'm curious. Is there a stable branch that adds the stable patches in
>> continuously? That is, during the merge window, to have a branch that
>> adds the stable patches as they come in and then when the merge window
>> closes, to post the rc series with all the patches that have landed in
>> that branch?
>
> Yes, it's in the stable-queue git tree. And in the linux-stable-rc tree
> for those that can not consume quilt trees. Been there for years...

Out of curiosity, as I seem to be missing something here:

Steven afaics asked for "continuously […] during the merge window" and
the answer apparently made a few people (including myself) happy. But I
can't see anything like that. Were you just busy with other stuff this
merge window and didn't get around to picking up the changes, or did I
look in the wrong place?

I occasionally kept an eye on the trees you mentioned during the past
few days and the branches/directories for 6.10.y in stable-queue and
linux-stable-rc afaics have been nonexistent until a few days ago
before you started to prepare 6.10.1 -- and since that one was released
~20 hours ago those branches/directories do not contain any additional
changes.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/log/?h=linux-6.10.y
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/pending

I assume that will change over the next few days shortly before or
after the merge window -- at least that's how I remember it from
previous cycles. But until then there is afaics no way to test the
current stack of changes that likely will end up in one of the next
6.10.y releases -- or is there and I just missed it somehow?

Ciao, Thorsten

^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions
2024-07-25 10:14 ` Thorsten Leemhuis
@ 2024-07-25 13:14 ` Greg KH
0 siblings, 0 replies; 107+ messages in thread
From: Greg KH @ 2024-07-25 13:14 UTC (permalink / raw)
To: Thorsten Leemhuis
Cc: Steven Rostedt, James Bottomley, Mark Brown, ksummit, Sasha Levin

On Thu, Jul 25, 2024 at 12:14:34PM +0200, Thorsten Leemhuis wrote:
> On 26.06.24 09:36, Greg KH wrote:
> > On Tue, Jun 25, 2024 at 05:51:31PM -0400, Steven Rostedt wrote:
> >> On Thu, 20 Jun 2024 12:02:21 -0400
> >> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> >>
> >>> After all, given that stable is now delaying backports in the merge
> >>> window, there should be at least a 2 week period where .0 is it
> >>> (although it's also the two week period where we're not paying
> >>> attention ...)
> >>
> >> I'm curious. Is there a stable branch that adds the stable patches in
> >> continuously? That is, during the merge window, to have a branch that
> >> adds the stable patches as they come in and then when the merge window
> >> closes, to post the rc series with all the patches that have landed in
> >> that branch?
> >
> > Yes, it's in the stable-queue git tree. And in the linux-stable-rc tree
> > for those that can not consume quilt trees. Been there for years...
>
> Out of curiosity, as I seem to be missing something here:
>
> Steven afaics asked for "continuously […] during the merge window" and
> the answer apparently made a few people (including myself) happy. But I
> can't see anything like that. Were you just busy with other stuff this
> merge window and didn't get around to picking up the changes, or did I
> look in the wrong place?

Ah, no, I read this wrong. There is no such tree that happens "during
the merge window", sorry, we are off doing other merge-window work and
not queueing up fixes then.

> I occasionally kept an eye on the trees you mentioned during the past
> few days and the branches/directories for 6.10.y in stable-queue and
> linux-stable-rc afaics have been nonexistent until a few days ago
> before you started to prepare 6.10.1 -- and since that one was
> released ~20 hours ago those branches/directories do not contain any
> additional changes.

You are correct, I missed the "during merge window" portion.

thanks,

greg k-h

^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions
2024-06-20 12:57 ` James Bottomley
2024-06-20 13:55 ` Mark Brown
@ 2024-06-20 16:59 ` Thorsten Leemhuis
2024-06-20 23:18 ` Sasha Levin
1 sibling, 1 reply; 107+ messages in thread
From: Thorsten Leemhuis @ 2024-06-20 16:59 UTC (permalink / raw)
To: James Bottomley, ksummit

On 20.06.24 14:57, James Bottomley wrote:
> On Thu, 2024-06-20 at 12:32 +0200, Thorsten Leemhuis wrote:
>> On 18.06.24 16:43, James Bottomley wrote:
>>> On Thu, 2024-06-13 at 10:22 +0200, Thorsten Leemhuis wrote:
>
>>> However, for obscure drivers and even some more complex
>>> dependencies, the regression sometimes isn't discovered until it
>>> gets into the hands of the wider pool of testers, often via stable.
>>>
>>> This is important, because it emphasizes that zero regressions in
>>> stable is impossible (and thus preventing backporting patches that
>>> cause regressions is also impossible) if stable is the vehicle by
>>> which some regressions are discovered.
>>
>> Of course "Zero regressions in stable is impossible" as we are
>> dealing with software. ;) And of course even with delayed backport
>> for non-urgent fixes some problems would make it through.
>>
>> But right now users testing mainline sometimes hardly have a chance
>> to test and report problems with mainline in time to prevent a
>> backport. Take Linux 6.7.2 (released 2024-01-25 23:58 UTC) with its
>> 640 changes for example, where users had only 4 days to do so, as
>> almost all of its changes had been merged for 6.8-rc1 (2024-01-21
>> 22:23 UTC). FWIW: 200 of those changes were committed to some
>> subsystem git tree during January, 363 during December, 70 during
>> November, and 7 during October.
>
> I did make this point here:
>
> https://lore.kernel.org/all/7794a2b09ae4fa73ac35fdaec4858145a665efea.camel@HansenPartnership.com/
>
> That merge window fixes should be delayed. Not because I think a
> longer soak in main would allow us to find many more bugs, simply
> because it was causing reports in the merge window that weren't handled
> because people had other things to do. The reply was that they're
> already doing it

Only for changes picked by autosel afaics, which are delayed for a while
already, yes (not sure, I think that was in place even before the 6.7
days).

But I'm pretty sure it was not autosel that resulted in most of the
backports that went into 6.7.2, given the lack of "patch autosel"
messages for those changes.

Those changes afaics were mainly patches with a stable tag (about 94
from a quick check) or a Fixes: tag (630); some had both. And those tags
(and not autosel) afaics were the reason for the backports.

> and when I looked, they actually started doing it for
> the 6.9 merge window (so your 6.7 example is probably out of date).

Yes, things looked different for those releases iirc. We would need to
ask Greg why; but from what I saw it looks a lot like "Greg was on
vacation and/or busy with other stuff" that slightly mixed things up.

>> But those are different problems.
>> And the situation regarding the first already got somewhat better
>> from what I can see -- among others afaics due to me prodding people
>> when the queue fixes for recent regression for the -next merge
>> window.
> Yes, that's why I was asking for stats on 6.9 and 6.10 where this delay
> policy was apparently in place.

v6.9.2..v6.9.3: 427 changes, all from the 6.10 merge window. From a
rough check it seems 41 of them have a stable tag and 407 a fixes tag
(some both).

6.9.3 was released on "2024-05-30 7:58 UTC", so not even four days
after 6.10-rc1 went out on "2024-05-26 22:31 UTC".

IOW: fewer patches than in the 6.7.2 case, but even less time for users
to test mainline, bisect, and report regressions to prevent a backport
in time.

Ciao, Thorsten (who prays he did not do something stupid while
generating those numbers)

^ permalink raw reply [flat|nested] 107+ messages in thread
* Re: [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions
2024-06-20 16:59 ` Thorsten Leemhuis
@ 2024-06-20 23:18 ` Sasha Levin
0 siblings, 0 replies; 107+ messages in thread
From: Sasha Levin @ 2024-06-20 23:18 UTC (permalink / raw)
To: Thorsten Leemhuis; +Cc: James Bottomley, ksummit

On Thu, Jun 20, 2024 at 06:59:56PM +0200, Thorsten Leemhuis wrote:
>On 20.06.24 14:57, James Bottomley wrote:
>> On Thu, 2024-06-20 at 12:32 +0200, Thorsten Leemhuis wrote:
>>> On 18.06.24 16:43, James Bottomley wrote:
>>>> On Thu, 2024-06-13 at 10:22 +0200, Thorsten Leemhuis wrote:
>>
>>>> However, for obscure drivers and even some more complex
>>>> dependencies, the regression sometimes isn't discovered until it
>>>> gets into the hands of the wider pool of testers, often via stable.
>>>>
>>>> This is important, because it emphasizes that zero regressions in
>>>> stable is impossible (and thus preventing backporting patches that
>>>> cause regressions is also impossible) if stable is the vehicle by
>>>> which some regressions are discovered.
>>>
>>> Of course "Zero regressions in stable is impossible" as we are
>>> dealing with software. ;) And of course even with delayed backport
>>> for non-urgent fixes some problems would make it through.
>>>
>>> But right now users testing mainline sometimes hardly have a chance
>>> to test and report problems with mainline in time to prevent a
>>> backport. Take Linux 6.7.2 (released 2024-01-25 23:58 UTC) with its
>>> 640 changes for example, where users had only 4 days to do so, as
>>> almost all of its changes had been merged for 6.8-rc1 (2024-01-21
>>> 22:23 UTC). FWIW: 200 of those changes were committed to some
>>> subsystem git tree during January, 363 during December, 70 during
>>> November, and 7 during October.
>>
>> I did make this point here:
>>
>> https://lore.kernel.org/all/7794a2b09ae4fa73ac35fdaec4858145a665efea.camel@HansenPartnership.com/
>>
>> That merge window fixes should be delayed. Not because I think a
>> longer soak in main would allow us to find many more bugs, simply
>> because it was causing reports in the merge window that weren't handled
>> because people had other things to do. The reply was that they're
>> already doing it
>
>Only for changes picked by autosel afaics, which are delayed for a while
>already, yes (not sure, I think that was in place even before the 6.7
>days).

All merge window content deemed for -stable only makes it into -stable
once -rc1 is released. This isn't specific to autosel.

>But I'm pretty sure it was not autosel that resulted in most of the
>backports that went into 6.7.2, given the lack of "patch autosel"
>messages for those changes.
>
>Those changes afaics were mainly patches with a stable tag (about 94
>from a quick check) or a Fixes: tag (630); some had both. And those tags
>(and not autosel) afaics were the reason for the backports.

Right, it was all commits with a stable/fixes tag. autosel usually
starts around ~.6.

>> and when I looked, they actually started doing it for
>> the 6.9 merge window (so your 6.7 example is probably out of date).
>
>Yes, things looked different for those releases iirc. We would need to
>ask Greg why; but from what I saw it looks a lot like "Greg was on
>vacation and/or busy with other stuff" that slightly mixed things up.

Greg can probably give a more accurate response, but we usually target
stable tagged commits from the merge window after Linus cuts -rc1.
Sometimes it ends up being later because of vacation/overload/etc.

>>> But those are different problems.
>>> And the situation regarding the first already got somewhat better
>>> from what I can see -- among others afaics due to me prodding people
>>> when the queue fixes for recent regression for the -next merge
>>> window.
>> Yes, that's why I was asking for stats on 6.9 and 6.10 where this delay
>> policy was apparently in place.
>
>v6.9.2..v6.9.3: 427 changes, all from the 6.10 merge window. From a
>rough check it seems 41 of them have a stable tag and 407 a fixes tag
>(some both).
>
>6.9.3 was released on "2024-05-30 7:58 UTC", so not even four days
>after 6.10-rc1 went out on "2024-05-26 22:31 UTC".
>
>IOW: fewer patches than in the 6.7.2 case, but even less time for users
>to test mainline, bisect, and report regressions to prevent a backport
>in time.

There's no "win" here: either our release cycles are short and we can
have relatively few commits in each release, or our release cycles are
longer and we end up with thousands of commits each time.

From experience, it's easier to work with smaller releases, both for us
as we compose and release these kernels, but also for testers since it's
way easier to spot issues in a smaller release.

--
Thanks,
Sasha

^ permalink raw reply [flat|nested] 107+ messages in thread
end of thread, other threads:[~2024-09-12 13:34 UTC | newest]
Thread overview: 107+ messages
2024-06-13 8:22 [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions Thorsten Leemhuis
2024-06-13 8:26 ` [MAINTAINERS SUMMIT] [1/4] Create written down guidelines for handling regressions Thorsten Leemhuis
2024-09-12 13:33 ` Thorsten Leemhuis
2024-06-13 8:32 ` [MAINTAINERS SUMMIT] [2/4] Ensure recent mainline regression are fixed in latest stable series Thorsten Leemhuis
2024-06-13 11:02 ` Johannes Berg
2024-06-13 11:21 ` Greg KH
2024-06-13 13:18 ` Sasha Levin
2024-06-13 11:17 ` Jiri Kosina
2024-06-13 11:28 ` Laurent Pinchart
2024-06-14 0:50 ` Steven Rostedt
2024-06-14 14:01 ` Mark Brown
2024-06-14 14:32 ` Rafael J. Wysocki
2024-06-13 8:34 ` [MAINTAINERS SUMMIT] [3/4] Elevate handling of regressions that made it to releases deemed for end users Thorsten Leemhuis
2024-06-13 11:34 ` Laurent Pinchart
2024-06-13 11:39 ` Jiri Kosina
2024-06-14 14:10 ` Mark Brown
2024-06-18 12:58 ` Thorsten Leemhuis
2024-06-19 20:25 ` Laurent Pinchart
2024-06-20 10:47 ` Thorsten Leemhuis
2024-06-13 15:56 ` Liam R. Howlett
2024-06-18 12:24 ` Thorsten Leemhuis
2024-06-20 13:20 ` Jani Nikula
2024-06-20 13:35 ` Thorsten Leemhuis
2024-06-20 14:16 ` Mark Brown
2024-06-21 6:47 ` Jiri Kosina
2024-06-21 10:19 ` Thorsten Leemhuis
2024-06-13 8:42 ` [MAINTAINERS SUMMIT] [4/4] Discuss how to better prevent backports of commits that turn out to cause regressions Thorsten Leemhuis
2024-06-13 9:59 ` Jan Kara
2024-06-13 10:18 ` Thorsten Leemhuis
2024-06-13 14:08 ` Konstantin Ryabitsev
2024-06-14 9:19 ` Lee Jones
2024-06-14 9:24 ` Lee Jones
2024-06-14 12:27 ` Konstantin Ryabitsev
2024-06-14 14:26 ` Konstantin Ryabitsev
2024-06-14 14:36 ` Lee Jones
2024-06-14 14:29 ` Michael Ellerman
2024-06-14 14:38 ` Konstantin Ryabitsev
2024-06-14 14:44 ` Rafael J. Wysocki
2024-06-14 15:08 ` Geert Uytterhoeven
2024-06-15 11:29 ` Michael Ellerman
2024-06-17 10:15 ` Jani Nikula
2024-06-17 12:42 ` Geert Uytterhoeven
2024-06-14 15:45 ` Mark Brown
2024-06-14 14:43 ` Mark Brown
2024-06-14 14:51 ` Konstantin Ryabitsev
2024-06-14 15:42 ` Mark Brown
2024-06-14 14:43 ` Steven Rostedt
2024-06-14 14:57 ` Laurent Pinchart
2024-06-16 1:13 ` Linus Torvalds
2024-06-16 3:28 ` Steven Rostedt
2024-06-16 4:59 ` Linus Torvalds
2024-06-16 8:22 ` Paolo Bonzini
2024-06-16 9:05 ` Geert Uytterhoeven
2024-06-16 15:07 ` Steven Rostedt
2024-06-17 13:48 ` Dan Carpenter
2024-06-17 15:23 ` Dan Carpenter
2024-06-17 14:39 ` Konstantin Ryabitsev
2024-06-17 16:04 ` Paul E. McKenney
2024-06-17 16:06 ` Konstantin Ryabitsev
2024-06-17 16:14 ` Paolo Bonzini
2024-06-17 16:18 ` Konstantin Ryabitsev
2024-06-17 17:11 ` Geert Uytterhoeven
2024-06-18 12:05 ` Michael Ellerman
2024-06-16 7:26 ` Takashi Iwai
2024-06-16 8:10 ` Paolo Bonzini
2024-06-16 11:31 ` Laurent Pinchart
2024-06-16 11:39 ` Takashi Iwai
2024-06-16 16:40 ` Linus Torvalds
2024-06-16 8:31 ` Jiri Kosina
2024-06-16 8:54 ` Geert Uytterhoeven
2024-06-13 19:39 ` Dan Carpenter
2024-06-14 1:00 ` Steven Rostedt
2024-06-13 11:58 ` James Bottomley
2024-06-13 13:06 ` Sasha Levin
2024-06-13 13:56 ` James Bottomley
2024-06-13 14:02 ` Greg KH
2024-06-13 15:11 ` James Bottomley
2024-06-13 16:27 ` Greg KH
2024-06-14 18:47 ` Sasha Levin
2024-06-17 10:59 ` Vlastimil Babka
2024-06-13 18:08 ` Sasha Levin
2024-06-13 13:45 ` Greg KH
2024-06-13 13:40 ` Sasha Levin
2024-06-18 13:12 ` Thorsten Leemhuis
2024-06-13 14:28 ` Andrew Lunn
2024-06-13 18:14 ` Sasha Levin
2024-06-14 14:41 ` Jan Kara
2024-06-14 15:03 ` Rafael J. Wysocki
2024-06-14 17:46 ` Sasha Levin
2024-06-18 14:43 ` [MAINTAINERS SUMMIT] [0/4] Common scenario for four proposals regarding regressions James Bottomley
2024-06-18 15:50 ` Mark Brown
2024-06-20 10:32 ` Thorsten Leemhuis
2024-06-20 12:57 ` James Bottomley
2024-06-20 13:55 ` Mark Brown
2024-06-20 14:01 ` James Bottomley
2024-06-20 14:42 ` Mark Brown
2024-06-20 16:02 ` James Bottomley
2024-06-20 17:15 ` Mark Brown
2024-06-20 23:25 ` Sasha Levin
2024-06-21 6:33 ` Thorsten Leemhuis
[not found] ` <20240625175131.672d14a4@rorschach.local.home>
2024-06-26 7:36 ` Greg KH
2024-06-26 18:32 ` Steven Rostedt
2024-06-26 19:05 ` James Bottomley
2024-07-25 10:14 ` Thorsten Leemhuis
2024-07-25 13:14 ` Greg KH
2024-06-20 16:59 ` Thorsten Leemhuis
2024-06-20 23:18 ` Sasha Levin