
Hi Otto,

On Wed, Apr 30, 2025 at 12:02 AM Otto Kekäläinen via developers <developers@lists.mariadb.org> wrote:

... I see there is now a lot of activity in
There is indeed some activity. The scope of this effort is yet to be defined. My initial aim was to fix failures that affect the builders participating in pull request checks. The way I see it currently: every month, identify the 10 or so most annoying sporadic failures and make sure they get fixed within a month. Currently I ask Elena to make a report based on buildbot logs, although I'm happy to accept lists from my colleagues. A failure has to be sporadic and has to occur often (at least a few times a day) according to the bb cross-reference. Since this issue was created (Apr 19), 4 issues have been fixed, 1 reviewed, and many analysed. I hope I will have enough resources to keep it rolling.
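The selection rule described above (sporadic, failing at least a few times a day, top 10 or so) could be sketched roughly as follows. This is only an illustration: the input format, the function name `top_sporadic_failures`, and the default threshold of 3 failures per day are assumptions made for the example, not the actual buildbot cross-reference schema or the report Elena produces.

```python
from collections import Counter


def top_sporadic_failures(runs, days, min_per_day=3.0, limit=10):
    """Rank frequently failing sporadic tests over a window of `days` days.

    runs: iterable of (test_name, failed) pairs, one per test run
          (hypothetical format, not the real buildbot schema).
    A test counts as sporadic if it failed at least once but not on every run.
    Returns at most `limit` (test_name, failure_count) pairs, most frequent first.
    """
    total = Counter()
    failed = Counter()
    for name, did_fail in runs:
        total[name] += 1
        if did_fail:
            failed[name] += 1

    candidates = [
        (name, count)
        for name, count in failed.items()
        if count < total[name]            # sporadic: it also passes sometimes
        and count / days >= min_per_day   # frequent: a few failures per day
    ]
    # Most failures first; ties broken by test name for a stable order.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates[:limit]
```

Whether a persistent failure (one that fails on every run) should be excluded this way is a judgment call; here it is filtered out only because the criterion above asks for sporadic failures.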
Having extra focus on this right now is of course good and helps make MariaDB 11.8 a solid release. Hopefully also people have some ideas brewing on potential process/policy changes so that it is clear what developers should do in the future if/when tests regress and GitHub published CI results go from green to red next time.
My thoughts...

There are hundreds of sporadic failures in our tests. Some fail once a year, others every hour. Some are test issues, others are real bugs. The more often a test fails, the higher priority it should receive.

Previous efforts of this kind were essential to keeping things in a decent state. However, they didn't seem to follow a systematic approach of prioritizing frequent failures. There has to be somebody who constantly monitors the situation and approaches the appropriate developers. It is inefficient to expect this from all team members; it is better if the team can concentrate on their assignments.

Normally CI doesn't go badly red anymore. A large part of the MariaDB core developers have switched to using GitHub pull requests for their daily work, which means they have to pass CI checks before they can push. There are exceptions though, when commits get pushed directly and bring persistent failures (which are usually fixed promptly). We should indeed get this fixed.

Let's see how this effort goes and adjust accordingly.

Regards,
Sergey