Hello Sergei,

as for (2) - I'd say the most important part is to figure out how to
select the subset of tests to run, not how to integrate it in buildbot.

Definitely! I was just reading up on buildbot to pass the time until I got access to the data! : )
 
I have a bzr repository with the test data (text files, basically
database dumps) and a pretty hairy perl script that processes them and
outputs result files. Then a gnuplot script that reads these result
files and shows graphs. 
 
There were 9 script runs, that is 9 result files and 9 graphs.
I'll attach them too.

The model I used was:

1. For every new revision buildbot runs all tests on all platforms

2. But they are sorted specially. No matter in what order tests are
executed, this doesn't change the result - the set of failed tests.  So,
my script was emulating that, changing the order in which tests were
run. With the goal to have failures happen as early as possible.

That's how to interpret the graphs. For example, on v5 you can see that
more than 90% of all test failures happen within the first 20K tests

Yup, I understand what you mean here. I can grasp the concepts, but I am still having some trouble with some of the terms you use in the graphs. I can see that recall is related to the percentage of failures that have been encountered, and I guess that cutoff has to do with how many files you analyze before you start reordering. I can also see a bit of your thought process in your comments, but I have a few questions.

I also have some questions about what you did with some of the data, so I can get ideas on how to do it myself, and about how buildbot organizes builds and how they correspond to code changes and test runs.

If it's not too inconvenient, do you think we could set up a Google Hangout or a Skype call on Monday to go over a few questions quickly?

If you are busy, we can do it through email or on IRC. I can also dive into the bzr repo and look at what you did, as well as read up on all the info about the data, but it would be really helpful if you could lend me a hand : )

Thank you very much.
Best,
Pablo