Hi, Otto,

On Apr 09, Otto Kekäläinen via developers wrote:
Hi!
CI bugs are being treated very seriously at the moment via MDEV-33073 "always green buildbot", a Blocker bug that covers all the CI failures that we notice.
Thanks for the reply Daniel, you have always been one of those taking the CI very seriously.
The reason I wrote to the developers mailing list is that I wish to raise this for a wider audience and get input from both core contributors and other contributors.
For example Trevor (CC'd as I am not sure if he is on this list) filed https://github.com/MariaDB/server/pull/2958 which failed in CI. Since the mainline was already failing ("red") and the PR submission showed lots of failing tests, Trevor had to do a lot of extra work figuring out which tests failed due to his changes, and which ones were already broken (which led to 3 separate PRs now in #3075, #3076 and #3077).
I suspect core developers don't suffer from failing CI to the same extent as they simply bypass it, or have much more time on their hands
Nobody can ignore CI failures except for admins. And even for them it's not easy - go to settings, disable branch protection, push, enable branch protection. I doubt they do it often.
and can spend time learning what failures can be ignored which week and month. The fact that the CI is not green seems to be a topic where the core developers are perhaps a bit blind to the bigger picture, while non-core contributors struggle with the extra work it incurs. Also in the eyes of the wider public, a constantly failing CI erodes trust in quality.
As Daniel wrote, there's MDEV-33073 "always green buildbot", and it's a blocker, which means it *will* be done before the next release. Take a look at the 10.5 branch - I've done >30 commits in the last couple of weeks specifically to fix sporadic test failures. This will be merged up soon.
While I understand that the natural reply is "we will get to green soon" and it makes a lot of sense, I am afraid it might be overly optimistic. We have repeatedly had the situation in the past where Daniel, Sergei and Monty all say the same week they want to fix all failing tests, but it only lasts for a short while and then we are back to failures on mainline CI.
This is what branch protection is for. It wasn't able to do much as tests were constantly failing. Now it can.
Thus, to permanently keep CI green on mainline branches I proposed:
I see two approaches to get to consistently green CI:
1) Stop all development and focus on just fixing these, don't continue until CI is fully green, and once it is fully green make the GitHub branch protection settings one notch stricter to not allow any new commits unless the CI is fully green so it never regresses again.
2) Disable these tests and make the rules in GitHub branch protection one notch stricter right away, and not allow any new commits unless the CI is fully green ensuring no new recurring failures are introduced.
What do other developers think about this?
I'm doing both: I fix what I can and disable the rest, creating MDEVs for disabled tests to have them fixed by the corresponding developer.

Regards,
Sergei
Chief Architect, MariaDB Server
and security@mariadb.org