"Philip Stoev" <pstoev@askmonty.org> writes:
> We should discuss how we handle new developments in MariaDB that cause our behavior to diverge from the one in MySQL. Since we claim to be a
Yes, agree. Thank you for bringing it up! Note that this is a general MariaDB discussion, not a Monty Program specific one, so I am moving the thread to maria-developers@.

So far, we have generally been ok at handling incompatible changes. The basic approach is to shape new features to be backwards compatible as much as possible, and to add any extra code needed to handle interoperability. For example, if an option is changed due to a feature, we should keep the old option around and emulate its behaviour, eventually deprecating it in a future version. Another example: with replication, we check the version of the master or slave at the other end, take care to only send what the other end understands, and likewise interpret what it sends correctly based on its version.

However, this has been mostly up to the individual developer (or often the reviewer), and if we can add some structure/process to avoid this being forgotten, that would be very good.
> The particular situation we are facing now is the microsecond precision patch from Serg. As part of introducing precision date and time datatypes, he did some refactoring around the Item_* tree.
> As a result, various corner cases (invalid dates, partially valid dates, invalid conversions between date and time, etc.) are now handled differently: different warnings are printed (or no warnings at all), the result of the expression differs, or a NULL or 0000-00-00 is returned.
> For example, MySQL reports that DATE(SEC_TO_TIME(8)) is equal to "2000-00-08", whereas Serg's patch returns "0000-00-00". In other words, MySQL chose to place the literal 8 into the days portion of the date value, whereas Serg's patch filtered the literal out somewhere along the way. There are numerous other such examples.
> While in most cases Serg's new behavior is more intuitive, and a program written to conform to PostgreSQL-level type safety will not be affected either way, it is entirely possible that a program written under MySQL's relaxed type safety rules, and with MySQL's behavior in mind, would start behaving unexpectedly.
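To make the DATE(SEC_TO_TIME(8)) case concrete, here it is as a statement with both results side by side. The outputs are the ones reported in this thread; exact results and warnings may vary with server version and SQL mode:

```sql
-- SEC_TO_TIME(8) yields the TIME value '00:00:08'.
-- Converting that TIME to a DATE is the corner case in question:
SELECT DATE(SEC_TO_TIME(8));
-- MySQL (and MariaDB before the patch): '2000-00-08'
--   (the 8 is placed into the days portion of the date)
-- MariaDB with Serg's microsecond patch: '0000-00-00'
--   (the invalid conversion is filtered out along the way)
```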
It sounds like in this case the changes in behaviour could be considered improvements, or even bug fixes, when seen in isolation. And the behaviour they change is not something that an application would deliberately be written to rely on. But since it is a change in behaviour, it has the potential to break applications that by accident happen to depend on it.

My opinion is that we should make the change in a major version (eg. 5.2->5.3), but not in a minor version (eg. 5.2.5->5.2.6), and document it in the release notes as an incompatible change. Basically the same way we did it in MySQL (and still do, I suppose). It is ok to change behaviour in a major version, if it is necessary and documented. I believe users anticipate such changes, and consequently are more careful about going eg. 5.1->5.5 than 5.1.x->5.1.y.

We should not needlessly change things. Every behaviour change should be carefully considered. If it has the potential to seriously break major applications, we should not make it (or should use deprecation, a compatibility mode, etc.). If the change is not necessary or useful, the existing behaviour should probably be left alone. But if the change makes sense and the new behaviour is much more reasonable, then we should be ready to make it. We cannot afford to be stuck forever with old poor decisions.

Another example is optimiser changes. In my opinion, we cannot totally avoid that improvements to the optimiser will make some corner-case queries slower. The optimiser has only limited knowledge of the data and the application. If, based on this knowledge, execution plan A is better than execution plan B, then we should be prepared to change the code (in a major version) to select A over B. This will invariably cause slowdowns in some cases where the knowledge is incomplete, but it is important to be able to make improvements. Again, we need to consider each case carefully. We should not do things that are obviously stupid (eg. do not select a table scan over a range scan unless _really_ sure it will be better), and we should avoid breaking major applications or usage patterns, even if a change theoretically makes sense.

So, a summary of my opinion: we should be very careful about incompatible changes. We should generally avoid them unless there is a very good reason for them. We should provide backward compatibility where possible and where it makes sense. If we do change behaviour, we should only do it in a major version and document it clearly in an "incompatible changes" list. And being able to continually improve the server long-term is important.
> So, how should we handle this particular case, and more importantly, how do we handle future cases like that in a stable and predictable manner? I understand that in the early days of MySQL it would have been a no-brainer to change everything as needed, but by now we are dealing with a huge installed base and powerful competitors, so we can not simply apply the happy hacking approach that was used in the 1990s.
Agree. No "happy hacking" approach to incompatible changes!

 - Kristian.