Hi Pablo,

On 23.07.2014 15:51, Pablo Estrada wrote:
Hi Elena,
It's hard to make suggestions without seeing what you currently have,
please let me know when you have pushed the code.
I just finished cleaning up the code with the new implementation, but the strategy is exactly the same, and the strategy is what I have been looking for advice on.
In any case, I just uploaded the new code:
https://github.com/pabloem/Kokiri/tree/core-wrapper_architecture
The strategy of using file correlation is still the same.
Thanks. I hoped you would have results of the experiments involving incoming lists of tests, as I think it's an important factor which might affect the results (and hence the strategy); but I'll look at what we have now.
Could you please explain what you mean by logging into buildbot (and by
more precise data collection via it)? How exactly are you planning to work with buildbot interactively? In the part that concerns our task, buildbot picks up a push, gets it compiled and runs MTR with certain predefined parameters. There isn't really much room for interaction. Possibly I totally misunderstand your question, so please elaborate on it.
What I can do (it also concerns your previous comment about the non-continuous data) is upload a fresh data dump for you; hopefully it will have [almost] all matching logs, so you'll get a consistent chunk of test runs to experiment with.
I mean adding some code that logs extra information, such as which tests were run in each test_run. This would be the main thing. I understand that the logfiles that you sent me contain this information, but storing them is not scalable, and even with a fresh dump, I'm not sure there would be a continuous set of data. I made a small script that matches the log files against the dump from the database, and the matching is quite random.
I will see what we can do about getting reliable lists one way or another; certainly the log files are a temporary solution, but it would be nice to use them for experiments and see the results anyway, because modifying the MTR/buildbot tandem, and especially collecting a considerable volume of new data, will take time.
Towards the end there is more matching, but it is still quite random, and the matching never stays consistent for long:
https://raw.githubusercontent.com/pabloem/random/master/matches.txt
As you can see, even close to the end there is a continuous set of 20 skipped test runs:
148484: - kvm-bintar-centos5-x86_1066-log-test-stdio
Skip 20
148485: - winx64-packages_3203-log-test-stdio
If I interpret your list correctly, you mean that logs for test runs with ids between 148464 and 148483 (inclusive) are missing. That's a bit strange. I see logs for the following runs:
148466 - winx64-packages_3170-log-test-stdio
148467 - win32-packages_3172-log-test-stdio
148470 - win-rqg-se_309-log-test-stdio
148471 - kvm-deb-lucid-x86_3313-log-test_4-stdio
148472 - win32-packages_3173-log-test-stdio
148473 - kvm-deb-debian6-amd64_2705-log-test_4-stdio
148474 - winx64-packages_3171-log-test-stdio
148476 - win-rqg-se_310-log-test-stdio
148478 - kvm-deb-debian6-x86_2850-log-test_4-stdio
148481 - win-rqg-se_311-log-test-stdio
148482 - kvm-bintar-centos5-amd64_359-log-test-stdio
148483 - kvm-deb-precise-amd64_2709-log-test_4-stdio
This is not to say that parsing logs is the best way to do things, but apparently something went wrong either with my archiving or with your matching. If you don't have these files, please let me know.
Now, regarding the misses.
148464, 148475, 148480 are bld-dan-release. For this builder we indeed don't seem to have logs; and the tests are not reliable there, so it should be all right to ignore failures from it.
148465 - that's a miss, something went wrong while storing logs.
148468, 148469, 148477, 148479 - these are real misses, we don't have these logs.
Most such misses should not happen for newer test runs. For example, logs for labrador only start from June, while our database dump was from April.
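The check described above (which run ids in a range have a log file and which do not) can be sketched in a few lines of Python. This is only an illustration, not the script either of us actually used: the function names are hypothetical, and the listing format "<run id> - <log file name>" is assumed from the lines quoted in this thread. The sample covers only the first few runs of the range.

```python
import re

# Assumed line format: "<run id> - <log file name>", optionally with a colon
# after the id as in matches.txt ("148484: - ...").
LINE_RE = re.compile(r"^\s*(\d+)\s*:?\s*-\s*(\S+)\s*$")

def parse_listing(lines):
    """Map test run id -> log file name for every line that matches."""
    found = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            found[int(m.group(1))] = m.group(2)
    return found

def missing_runs(found, first_id, last_id):
    """Return the run ids in [first_id, last_id] that have no log file."""
    return [i for i in range(first_id, last_id + 1) if i not in found]

# A few of the runs listed above, used as sample input.
listing = [
    "148466 - winx64-packages_3170-log-test-stdio",
    "148467 - win32-packages_3172-log-test-stdio",
    "148470 - win-rqg-se_309-log-test-stdio",
]
found = parse_listing(listing)
print(missing_runs(found, 148464, 148470))
# prints [148464, 148465, 148468, 148469]
```

Comparing the output of such a check on both sides would show quickly whether the archive or the matching is at fault.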
So, what I had suggested was to log more data about each test run: mainly which tests ran, but also as much other information as possible.
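To make the suggestion concrete, here is a minimal sketch of what such per-run logging could look like, assuming one JSON line per test run. Nothing here exists in buildbot or MTR: the function names, the record fields and the test names in the example are all hypothetical.

```python
import io
import json

def log_test_run(stream, run_id, builder, tests_run):
    """Append one JSON line describing a single test run (hypothetical format)."""
    record = {
        "test_run_id": run_id,
        "builder": builder,
        "tests": sorted(tests_run),
    }
    stream.write(json.dumps(record) + "\n")

def read_test_runs(stream):
    """Read the log back as a mapping: test run id -> set of tests that ran."""
    return {rec["test_run_id"]: set(rec["tests"]) for rec in map(json.loads, stream)}

# Example with an in-memory stream; a real setup would write to a file or table.
buf = io.StringIO()
log_test_run(buf, 148466, "winx64-packages", ["main.example_b", "main.example_a"])
buf.seek(0)
runs = read_test_runs(buf)
print(sorted(runs[148466]))
```

One small record per run is far cheaper to keep than the full stdio logs, and it gives exactly the list of tests that the simulation needs as input.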
For now, yes, if you'd be so kind, please upload a fresh dump of the database : )
I've uploaded the fresh dump. Same location, file name buildbot-20140722.dump.gz.

Regards,
Elena
Regards
Pablo