Hello everyone,
Here's a short report on what I have so far:
- I had a slow couple of weeks because of a quick holiday that I took. I will make up for that.
- I added the metric that considers the number of test_runs since a test_case last ran. I graphed it, and it barely affects the results. I still think this is useful for uncovering hidden bugs that might lurk in the code for a long time, but testing that condition is difficult with our data. I would like to keep the measure, especially since it doesn't seem to affect results negatively (first sketch below the list). Opinions?
- I liked Sergei's idea of using changes to the test files to calculate the relevancy index: if a test has been changed recently, its relevancy index should be high. This is also more realistic, and it uses information that is easy for us to extract (second sketch below the list).
- I am trying to match the change_files table (or the mysql-test directory) with the test_failure table. I was confused by the naming of tests and test suites, but I am making progress. Once I can match at least 90% of the test_names in test_failure with the filenames in the change_files table, I will incorporate this data into the code and see how it works out (third sketch below the list).
- Question: Looking at the change_files table, there are files that have been ADDED several times. Why would this be? Maybe when a new branch is created, all files are ADDED to it? Any ideas? : ) (If no one knows, I'll figure it out, but maybe you do know ; ))
- I uploaded inline comments for my code last week; let me know if they're clear enough. You can start with run_basic_simulations.py, where the most important functions are called, and then dive into basic_simulator.py, where the simulation is actually done. The repository is a bit messy, I admit; I'll clean it up in the following commits.
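
First sketch (the "runs since last execution" measure). This is only a minimal illustration of the idea; the names (staleness_factor, runs_since_last_execution, the scale of 100) are hypothetical, not what's in the repository:

    def staleness_factor(runs_since_last_execution, scale=100.0):
        # Grows with the number of test_runs since the test_case last
        # ran, saturating at 2x so very stale tests cannot dominate.
        return 1.0 + min(runs_since_last_execution / scale, 1.0)

    def adjusted_relevancy(base_relevancy, runs_since_last_execution):
        # A long-unexecuted test slowly floats up in the priority order.
        return base_relevancy * staleness_factor(runs_since_last_execution)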
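Second sketch (Sergei's change-recency idea). The exponential-decay shape and the half_life value are my assumptions, just to make the idea concrete:

    import math

    def change_recency_boost(runs_since_file_changed, half_life=50.0):
        # Highest right after the test file was edited; halves every
        # half_life test_runs as the change grows older.
        return math.exp(-math.log(2) * runs_since_file_changed / half_life)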
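Third sketch (the name matching). This assumes the standard mysql-test layout, i.e. mysql-test/t/<name>.test for the main suite and mysql-test/suite/<suite>/t/<name>.test otherwise; if the branches use a different layout, the regex needs adjusting:

    import re

    def test_name_from_path(path):
        # Map a changed file path to a 'suite.testname' key, e.g.
        # mysql-test/suite/innodb/t/bar.test -> 'innodb.bar' and
        # mysql-test/t/foo.test -> 'main.foo'.
        m = re.match(r'.*mysql-test/(?:suite/([^/]+)/)?t/([^/]+)\.test$', path)
        if m is None:
            return None
        return '%s.%s' % (m.group(1) or 'main', m.group(2))

    def match_rate(failure_names, changed_paths):
        # Fraction of test_failure names matched to some changed file;
        # the target mentioned above is at least 90%.
        keys = set(test_name_from_path(p) for p in changed_paths)
        return sum(1 for n in failure_names if n in keys) / float(len(failure_names))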
This is all I have to report for now. Any advice on the way I'm proceeding is welcome : )
Have a nice week, everyone.
Pablo