Re: [PATCH] MDEV-28413 System versioning is incorrect in Spider

Hi Aleksey, Thanks for the patch. Please see below for some initial review comments. There are some todos for myself noted below which I will look into tomorrow.
Basic functioning of system versioning in Spider: UPDATE, DELETE, DELETE HISTORY, AS OF, FROM .. TO, BETWEEN .. AND.
More testing should be done for: LOAD DATA, joins, derived tables, multi-update/delete and what else?
[... 149 lines elided]
For init the spider plugin in spider tests we commonly do --disable_query_log --disable_result_log --source ../t/test_init.inc --enable_result_log --enable_query_log
Use evalp instead of eval and you can save the --replace_result line
I tried executing directly using select spider_direct_sql('set timestamp= unix_timestamp("2000-01-01 00:00:00");', '', 'srv "s"'); But then I get extra rows in subsequent select queries and I think it is caused by the timezones. It could be a separate task to fix this, but we should warn users about it.
+insert into t1_sp values (1, 1), (2, 2);
It may be obvious to you, but I find the timestamps at times hard to keep track of, so I suggest adding a comment to each DML statement indicating the time, as well as the select row_start / row_end statements. For example, change the line above to insert into t1_sp values (1, 1), (2, 2); # T1
+insert into t2_sp values (1, 1), (2, 2);
and change this line to insert into t2_sp values (1, 1), (2, 2); # T2
add # T3
+--sleep 0.01 +delete from t2_sp where x = 1;
# T4
And so we can annotate this line with the timestamp above, like so: select row_end into @t1 from t2 for system_time all where x = 1; # @t1 = T4
+ +update t1_sp set y= y + 1;
# T5
+select row_start into @t2 from t1 where x = 2;
# @t2 = T5
+--sleep 0.01 +update t2_sp set y= y + 1;
# T6 With the annotations I suggested, results from the following block of selects becomes easier to understand.
This is fine. The Spider group-by-handler (GBH) is not meant to handle every situation. Either remove this block starting from the --echo, or make it more succinct by replacing it with something like (Note that GBH supports certain functions like trim): # The spider group-by-handler does not support the check_row function
I will check why the execution without the group-by-handler results in wrong results.
Shouldn't the test name be spider.versioning?
For deinit the spider plugin in spider tests we commonly do --disable_query_log --disable_result_log --source ../t/test_deinit.inc --enable_result_log --enable_query_log Please replace the lines from "uninstall plugin spider;" to the end of the test with this block.
Why remove the #ifndef and #endif?
[... 99 lines elided]
Best, Yuchen

Hi Aleksey, Please see below for further review comments. On Wed 2024-10-02 17:44:36 +1000, Yuchen Pei wrote:
[... 90 lines elided]
I tried executing directly using
select spider_direct_sql('set timestamp= unix_timestamp("2000-01-01 00:00:00");', '', 'srv "s"');
I mean warn users about it in the spider section of the knowledge base.
[... 66 lines elided]
# The spider group-by-handler does not support the check_row function
Scratch that. I did some debugging and understand that the check_rows function is actually an UDF, so this can be fixed by issuing set spider_use_pushdown_udf=1; before issuing the select statements calling check_row(). Please update the test with this and remove the comment block starting with the --echo # spider cannot call functions...
I will check why the execution without the group-by-handler results in wrong results.
I checked this, and it is caused by a missing "for system_time all" in the spider query construction when the sql layer calls rnd_next() on the handler: --8<---------------cut here---------------start------------->8--- 5 in spider_mbase_handler::append_minimum_select_part of /home/ycp/source/mariadb-server/10.5/src/storage/spider/spd_db_mysql.cc:8949 6 in ha_spider::append_minimum_select_sql_part of /home/ycp/source/mariadb-server/10.5/src/storage/spider/ha_spider.cc:10231 7 in spider_db_append_select_columns of /home/ycp/source/mariadb-server/10.5/src/storage/spider/spd_db_conn.cc:1366 8 in ha_spider::rnd_next_internal of /home/ycp/source/mariadb-server/10.5/src/storage/spider/ha_spider.cc:5161 9 in ha_spider::rnd_next of /home/ycp/source/mariadb-server/10.5/src/storage/spider/ha_spider.cc:5298 10 in handler::ha_rnd_next of /home/ycp/source/mariadb-server/10.5/src/sql/handler.cc:3187 11 in find_all_keys of /home/ycp/source/mariadb-server/10.5/src/sql/filesort.cc:904 12 in filesort of /home/ycp/source/mariadb-server/10.5/src/sql/filesort.cc:352 13 in create_sort_index of /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:24709 14 in st_join_table::sort_table of /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:22314 15 in join_init_read_record of /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:22253 16 in sub_select of /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:21307 --8<---------------cut here---------------end--------------->8--- A quick fix could be: --8<---------------cut here---------------start------------->8--- modified storage/spider/spd_db_mysql.cc @@ -7713,6 +7713,12 @@ int spider_mbase_handler::append_table_name_with_adjusting( } else { error_num = mysql_share->append_table_name_with_adjusting(str, spider->conn_link_idx[link_idx]); + const auto& vers_conditions= + spider->get_table()->pos_in_table_list->vers_conditions; + if (spider->wide_handler->sql_command == SQLCOM_SELECT && + (vers_conditions.type == SYSTEM_TIME_ALL || + vers_conditions.orig_type == SYSTEM_TIME_ALL)) + str->append(STRING_WITH_LEN(" for system_time all ")); } DBUG_RETURN(error_num); } --8<---------------cut here---------------end--------------->8--- It looks a bit ad-hoc, but for the purpose of this patch it may be sufficient. One should also check for potential segv in spider or wide_handler. The condition spider->wide_handler->sql_command == SQLCOM_SELECT is needed because the DELETE statement delete from t1_sp where x = 1 also has vers_conditions.type == SYSTEM_TIME_ALL (why?)
[... 98 lines elided]
Why remove the #ifndef and #endif?
I see it is because table is used the next line which I missed. My apologies - please disregard this comment.
[... 4 lines elided]
BTW, the part of the patch mentioning MDEV-16546 such as in spider_mbase_handler::append_insert_values(), spider_mbase_handler::append_into(), and the set timestamp... issue mentioned in my previous email could be part of one separate task that fix the timestamp timezone issue. Best, Yuchen
participants (1)
-
Yuchen Pei