[Maria-developers] Thoughts on a variable to control high resolution temporal type format and rounding behavior?
Sergei, In the process of working through the bugs I've reported re. time formats and InnoDB storage formats we noted that there isn't currently any way to cause MariaDB to create a *new* Oracle MySQL 5.6-format TIMESTAMP/TIME/DATETIME field, and thus there's not really very good test coverage of them. Additionally the existence of this difference relative to Oracle MySQL is not really desirable. I would propose adding a new sysvar to control the temporal time format e.g. temporal_microsecond_format={mysql|mariadb} which controls whether newly created columns are created as Field_*_hires or Field_*f. This would especially allow: 1. Tests to be run in both Oracle MySQL and MariaDB modes. 2. A single test to create a column in one format and request conversion to the other. 3. A user to convert the format of their existing tables for compatibility. Additionally I would suggest a new sysvar to control the rounding behavior of the same temporal types, e.g. temporal_microsecond_rounding={round|truncate} which controls whether to round (as Oracle MySQL 5.6 does) or to truncate (as MariaDB does) when faced with fitting a larger number of decimals in a value into a smaller number decimals in a field. (This of course assumes that the current MariaDB behavior and divergence from Oracle MySQL behavior is intentional, and not a bug. Even if it is considered a bug it could probably not be changed outright at this point and the sysvar would still be needed.) Any thoughts on either of these? We would of course be willing to do the work and provide the patches for them. I just wanted to get your general feelings on the ideas. Regards, Jeremy
Hi Jeremy, On 11/12/2013 07:09 AM, Jeremy Cole wrote:
Sergei,
In the process of working through the bugs I've reported re. time formats and InnoDB storage formats we noted that there isn't currently any way to cause MariaDB to create a *new* Oracle MySQL 5.6-format TIMESTAMP/TIME/DATETIME field, and thus there's not really very good test coverage of them. Additionally the existence of this difference relative to Oracle MySQL is not really desirable.
I would propose adding a new sysvar to control the temporal time format e.g. temporal_microsecond_format={mysql|mariadb} which controls whether newly created columns are created as Field_*_hires or Field_*f. This would especially allow:
1. Tests to be run in both Oracle MySQL and MariaDB modes. 2. A single test to create a column in one format and request conversion to the other. 3. A user to convert the format of their existing tables for compatibility.
Additionally I would suggest a new sysvar to control the rounding behavior of the same temporal types, e.g. temporal_microsecond_rounding={round|truncate} which controls whether to round (as Oracle MySQL 5.6 does) or to truncate (as MariaDB does) when faced with fitting a larger number of decimals in a value into a smaller number decimals in a field. (This of course assumes that the current MariaDB behavior and divergence from Oracle MySQL behavior is intentional, and not a bug. Even if it is considered a bug it could probably not be changed outright at this point and the sysvar would still be needed.)
Any thoughts on either of these? We would of course be willing to do the work and provide the patches for them. I just wanted to get your general feelings on the ideas.
New user variables are usually painful for replication. It would be nice to avoid them. I'd propose the following approach: 1. Rounding vs truncation: Understand a new option in SQL_MODE: SET sql_mode=TEMPORAL_ROUNDING; SQL_MODE is already replicated, so no changes in the replication code will be needed. 2. MySQL vs MariaDB data types: Understand some syntax like this: CREATE TABLE t1 (a /*M!50501 MYSQL*/ TIME(6)); CREATE TABLE t1 (a /*M!50501 MARIA*/ TIME(6)); so it's possible to make a self-containing CREATE statement. The default type on the master side should be MariaDB. The default type of the slave side can check whether the master is MariaDB or MySQL (I think it should be known), and choose the corresponding data type.
Regards,
Jeremy
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Hi, Jeremy! On Nov 11, Jeremy Cole wrote:
I would propose adding a new sysvar to control the temporal time format e.g. temporal_microsecond_format={mysql|mariadb} which controls whether newly created columns are created as Field_*_hires or Field_*f. This would especially allow:
1. Tests to be run in both Oracle MySQL and MariaDB modes. 2. A single test to create a column in one format and request conversion to the other. 3. A user to convert the format of their existing tables for compatibility.
Agree.
Additionally I would suggest a new sysvar to control the rounding behavior of the same temporal types, e.g. temporal_microsecond_rounding={round|truncate} which controls whether to round (as Oracle MySQL 5.6 does) or to truncate (as MariaDB does) when faced with fitting a larger number of decimals in a value into a smaller number decimals in a field. (This of course assumes that the current MariaDB behavior and divergence from Oracle MySQL behavior is intentional, and not a bug. Even if it is considered a bug it could probably not be changed outright at this point and the sysvar would still be needed.)
Perhaps, yes. When this feature was discussed in MySQL, while I was still there, we all were in agreement about truncating, not rounding. With the reasoning - to avoid drastic date changes, like, when a year changes when you store 1999-12-31 23:59:59.9999 in DATETIME(1) column. I don't know why Oracle has changed that and their arguments, we in MariaDB still thought that these drastic date changes would be an undesirable gotcha, and we've decided to truncate. So, not a bug.
Any thoughts on either of these? We would of course be willing to do the work and provide the patches for them. I just wanted to get your general feelings on the ideas.
Bar made a point about sysvars being difficult for replication. I'd say, it depends on your use case. If it's important to run CREATE TABLE on the master and have a temporal column to be created on the slave using exactly the same binary on-disk format (especially, if one expects to change this sysvar often) - then sysvar changes must be replicated, and it would need special support from the replication code. Otherwise - if the above is not the goal - then sysvar is a good and clean approach, agree. Regards, Sergei
Sergei,
Additionally I would suggest a new sysvar to control the rounding behavior of the same temporal types, e.g. temporal_microsecond_rounding={round|truncate} which controls whether to round (as Oracle MySQL 5.6 does) or to truncate (as MariaDB does) when faced with fitting a larger number of decimals in a value into a smaller number decimals in a field. (This of course assumes that the current MariaDB behavior and divergence from Oracle MySQL behavior is intentional, and not a bug. Even if it is considered a bug it could probably not be changed outright at this point and the sysvar would still be needed.)
Perhaps, yes. When this feature was discussed in MySQL, while I was still there, we all were in agreement about truncating, not rounding. With the reasoning - to avoid drastic date changes, like, when a year changes when you store 1999-12-31 23:59:59.9999 in DATETIME(1) column.
I don't know why Oracle has changed that and their arguments, we in MariaDB still thought that these drastic date changes would be an undesirable gotcha, and we've decided to truncate. So, not a bug.
I guess it doesn't matter why they decided to differ, but the fact that they did means that something has to be done in MariaDB. For instance, if replicating from MySQL 5.6 to MariaDB, the rounding differences will potentially cause completely different values to be stored for the same query, causing data drift between master and slave.
Any thoughts on either of these? We would of course be willing to do
the work and provide the patches for them. I just wanted to get your general feelings on the ideas.
Bar made a point about sysvars being difficult for replication.
I'd say, it depends on your use case. If it's important to run CREATE TABLE on the master and have a temporal column to be created on the slave using exactly the same binary on-disk format (especially, if one expects to change this sysvar often) - then sysvar changes must be replicated, and it would need special support from the replication code.
Otherwise - if the above is not the goal - then sysvar is a good and clean approach, agree.
I think there are two cases here with different goals perhaps: 1. The on-disk storage format
Wow, I accidentally sent that (Command-Enter boo). Ignore that; real reply coming shortly. On Wed, Nov 13, 2013 at 9:39 AM, Jeremy Cole <jeremycole@google.com> wrote:
Sergei,
Additionally I would suggest a new sysvar to control the rounding behavior of the same temporal types, e.g. temporal_microsecond_rounding={round|truncate} which controls whether to round (as Oracle MySQL 5.6 does) or to truncate (as MariaDB does) when faced with fitting a larger number of decimals in a value into a smaller number decimals in a field. (This of course assumes that the current MariaDB behavior and divergence from Oracle MySQL behavior is intentional, and not a bug. Even if it is considered a bug it could probably not be changed outright at this point and the sysvar would still be needed.)
Perhaps, yes. When this feature was discussed in MySQL, while I was still there, we all were in agreement about truncating, not rounding. With the reasoning - to avoid drastic date changes, like, when a year changes when you store 1999-12-31 23:59:59.9999 in DATETIME(1) column.
I don't know why Oracle has changed that and their arguments, we in MariaDB still thought that these drastic date changes would be an undesirable gotcha, and we've decided to truncate. So, not a bug.
I guess it doesn't matter why they decided to differ, but the fact that they did means that something has to be done in MariaDB. For instance, if replicating from MySQL 5.6 to MariaDB, the rounding differences will potentially cause completely different values to be stored for the same query, causing data drift between master and slave.
Any thoughts on either of these? We would of course be willing to do
the work and provide the patches for them. I just wanted to get your general feelings on the ideas.
Bar made a point about sysvars being difficult for replication.
I'd say, it depends on your use case. If it's important to run CREATE TABLE on the master and have a temporal column to be created on the slave using exactly the same binary on-disk format (especially, if one expects to change this sysvar often) - then sysvar changes must be replicated, and it would need special support from the replication code.
Otherwise - if the above is not the goal - then sysvar is a good and clean approach, agree.
I think there are two cases here with different goals perhaps:
1. The on-disk storage format
Sergei,
Additionally I would suggest a new sysvar to control the rounding behavior of the same temporal types, e.g. temporal_microsecond_rounding={round|truncate} which controls whether to round (as Oracle MySQL 5.6 does) or to truncate (as MariaDB does) when faced with fitting a larger number of decimals in a value into a smaller number decimals in a field. (This of course assumes that the current MariaDB behavior and divergence from Oracle MySQL behavior is intentional, and not a bug. Even if it is considered a bug it could probably not be changed outright at this point and the sysvar would still be needed.)
Perhaps, yes. When this feature was discussed in MySQL, while I was
still there, we all were in agreement about truncating, not rounding.
With the reasoning - to avoid drastic date changes, like, when a year
changes when you store 1999-12-31 23:59:59.9999 in DATETIME(1) column.
I don't know why Oracle has changed that and their arguments, we in
MariaDB still thought that these drastic date changes would be an undesirable gotcha, and we've decided to truncate. So, not a bug.
I guess it doesn't matter why they decided to differ, but the fact that they did means that something has to be done in MariaDB. For instance, if replicating from MySQL 5.6 to MariaDB, the rounding differences will potentially cause completely different values to be stored for the same query, causing data drift between master and slave. I don't think that I agree about "drastic changes" being bad though -- they aren't drastic except that humans are unnecessarily attached to specific dates and it "seems" worse to round up and cause all the fields to roll over. But it really is no difference. If you were presented with the UNIX epoch timestamp 946713599.9999 and asked how to round it, you would not hesitate to round it up, despite it being exactly the same date you provided before. And in fact MariaDB does this for decimal fields, so this exists as an inconsistency *internally* within MariaDB: substring_index(version(), "-", 1): 5.6.14 cast(123.999999 as decimal(10,3)): 124.000 cast(-123.999999 as decimal(10,3)): -124.000 cast("01:02:03.999999" as time(3)): 01:02:04.000 cast("-01:02:03.999999" as time(3)): -01:02:04.000 vs. substring_index(version(), "-", 1): 10.0.4 cast(123.999999 as decimal(10,3)): 124.000 cast(-123.999999 as decimal(10,3)): -124.000 cast("01:02:03.999999" as time(3)): 01:02:03.999 cast("-01:02:03.999999" as time(3)): -01:02:03.999 IMHO, that is unexpected.
Any thoughts on either of these? We would of course be willing to do
the work and provide the patches for them. I just wanted to get your general feelings on the ideas.
Bar made a point about sysvars being difficult for replication.
I'd say, it depends on your use case. If it's important to run CREATE
TABLE on the master and have a temporal column to be created on the
slave using exactly the same binary on-disk format (especially, if one
expects to change this sysvar often) - then sysvar changes must be
replicated, and it would need special support from the replication code.
Otherwise - if the above is not the goal - then sysvar is a good and clean approach, agree.
I think there are two cases here with different goals perhaps: 1. The on-disk storage format This is really in the purview of the DBA, not the user. I don't think using sql_mode or allowing the user to choose is appropriate. The DBA must decide whether they want compatibility with older MariaDB or compatibility with newer MySQL, and the user should not be able to compromise that. Additionally, we should not follow the master's choice here, as divergence in this is a valid (and maybe desired) upgrade path -- so I think replicating this is unnecessary. I would opt for a global + session sysvar. 2. The rounding behavior This is unfortunate and not easy to solve; I think it is in both the DBA and the user's purview in two different and slightly conflicting ways: - For replication, the slave should always follow the master's behavior, to avoid data drift. That would mean detecting when connected to MySQL and automatically switching that behavior on. Once this is configurable though, it may differ when replicating MariaDB-MariaDB as well, so then it must also be handled on a per-query basis, which probably requires a sql_mode or similar mechanism. - For user connections, IMHO it could be handled either way, and if an sql_mode would be best that is probably okay. However the DBA should decide the default behavior for them, and they may change it in their session if they like. Overall I would like to get to a place where compatibility with MySQL behavior could be maintained and forced by the DBA without the user having to know or care about the compatibility differences we've brought on ourselves. Regards, Jeremy
Hi, Jeremy! On Nov 13, Jeremy Cole wrote:
Bar made a point about sysvars being difficult for replication.
I'd say, it depends on your use case. If it's important to run CREATE TABLE on the master and have a temporal column to be created on the slave using exactly the same binary on-disk format (especially, if one expects to change this sysvar often) - then sysvar changes must be replicated, and it would need special support from the replication code.
Otherwise - if the above is not the goal - then sysvar is a good and clean approach, agree.
I think there are two cases here with different goals perhaps:
1. The on-disk storage format
This is really in the purview of the DBA, not the user. I don't think using sql_mode or allowing the user to choose is appropriate. The DBA must decide whether they want compatibility with older MariaDB or compatibility with newer MySQL, and the user should not be able to compromise that. Additionally, we should not follow the master's choice here, as divergence in this is a valid (and maybe desired) upgrade path -- so I think replicating this is unnecessary. I would opt for a global + session sysvar.
Okay, then sysvar is good.
2. The rounding behavior
This is unfortunate and not easy to solve; I think it is in both the DBA and the user's purview in two different and slightly conflicting ways:
On a second thought, I'd go with a sysvar too. It wouldn't be the only sysvar that affects the results, and must be replicated. For example, @@div_precision_increment is also one of those, if the master and slave have is set differently, they'll start to diverge. So, I'd rather have a separate solution that covers replication and all such sysvars, instead of trying to put everything into sql_mode. Regards, Sergei
participants (3)
-
Alexander Barkov
-
Jeremy Cole
-
Sergei Golubchik