Hi, Pablo! To add to my last reply: On Jun 03, Sergei Golubchik wrote:
Well, it's your project, you can keep any measure you want. But please mark clearly (in comments or whatever) what factors affect results and what don't.
It would be very useful to be able to see the simplest possible model that still delivers reasonably good results, even if we decide to use something more complicated in the end.

... Same as above, basically. I'd prefer to use not the model that simply "looks realistic", but the one that makes the best predictions.
You can use whatever criteria you prefer, but if taking changed tests into account does not improve the results, I'd like that to be clearly documented or visible in the code.
Alternatively, you can deliver (when this GSoC project ends) two versions of the script: one with anything you want in it, and a second one that is as simple as possible.

For example, the only really important metric is "recall as a function of total testing time". We want to reach as high a recall as possible in the shortest possible testing time, right? By this criterion one needs to take into account individual test execution times (it's better to run 5 fast tests than 1 slow test) and individual builder speed factors (better to run 10 tests on a fast builder than 5 tests on a slow builder). But in my tests it turned out that these complications don't improve the results much. So, while they make perfect sense and make the model more realistic, the simple model can survive perfectly well without them and use the "recall vs. number of tests" metric instead.

Regards,
Sergei
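P.S. To make the two metrics concrete, here is a minimal sketch of how "recall at a budget" could be computed under either definition. All test names, runtimes, and the failing set below are hypothetical, just for illustration; this is not code from the actual script:

```python
def recall_at_budget(ranked_tests, failing, budget, cost=lambda t: 1):
    """Fraction of actually-failing tests caught when running the
    ranked tests in order until the budget is exhausted.

    With the default cost (1 per test) the budget is a test count,
    i.e. the simple "recall vs. number of tests" metric.  Passing a
    per-test runtime as `cost` turns the budget into total testing
    time, i.e. "recall vs. total testing time"."""
    spent, caught = 0, 0
    for name in ranked_tests:
        spent += cost(name)
        if spent > budget:
            break
        if name in failing:
            caught += 1
    return caught / len(failing) if failing else 1.0

# Hypothetical ranking from some predictor, and hypothetical
# per-test runtimes in seconds.
ranking = ["t_fast_a", "t_slow", "t_fast_b", "t_fast_c"]
runtime = {"t_fast_a": 1, "t_slow": 30, "t_fast_b": 1, "t_fast_c": 1}
failing = {"t_slow", "t_fast_b"}

# Simple model: budget = 2 tests (catches t_slow, one of two failures).
print(recall_at_budget(ranking, failing, budget=2))
# Time-aware model: budget = 5 seconds (t_slow no longer fits).
print(recall_at_budget(ranking, failing, budget=5,
                       cost=lambda t: runtime[t]))
```

The two calls show why the time-aware metric can rank differently: under a time budget the slow test crowds out several fast ones, which is exactly the "5 fast tests vs. 1 slow test" trade-off above.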