developers
Threads by month
- ----- 2025 -----
- January
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2010 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2009 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
May 2011
- 20 participants
- 22 discussions
09 Jun '11
Moin,
I did some tmp-table tests. As it is told MariaDB is faster on tmp-tables because of the Aria-Engine.
Therefore Ive got 3 Severs running:
onyx:/data/dbod # for i in 6051 6053 6060; do mysql -u erkan -h 10.255.128.4 -P $i -e 'select version()'; done
+------------+
| version() |
+------------+
| 5.5.12-log |
+------------+
+-------------------+
| version() |
+-------------------+
| 5.2.6-MariaDB-log |
+-------------------+
+------------+
| version() |
+------------+
| 5.1.57-log |
key_buffer_size is 16MB. For MariaDB there is also:
mysql> show global variables like 'aria_pagecache_buffer_size';
+----------------------------+----------+
| Variable_name | Value |
+----------------------------+----------+
| aria_pagecache_buffer_size | 16777216 |
+----------------------------+----------+
To provoke tmp-tables I did:
onyx:/data/dbod # for i in 6051 6053 6060; do mysql -u erkan -h 10.255.128.4 -P $i -e 'show global variables like "tmp_table_size"'; done
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| tmp_table_size | 1024 |
+----------------+-------+
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| tmp_table_size | 1024 |
+----------------+-------+
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| tmp_table_size | 1024 |
+----------------+-------+
We got two tables:
mysql> desc sort_id;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
1 row in set (0.00 sec)
mysql> desc sort_1;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| tea | text | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)
With 10000 Rows each.
Tests:
onyx:/var/tmp # for i in 6051 6053 6060; do mysqlslap -u erkan -P $i -h 10.255.128.4 --create-schema=sbtest -q "SELECT * from sbtest.sort_1 group by id;" --number-of-queries=20 -i 3; done
Benchmark [[5.5.12]]
Average number of seconds to run all queries: 1.803 seconds
Minimum number of seconds to run all queries: 1.791 seconds
Maximum number of seconds to run all queries: 1.826 seconds
Number of clients running queries: 1
Average number of queries per client: 20
Benchmark [[MariaDB]]
Average number of seconds to run all queries: 5.547 seconds
Minimum number of seconds to run all queries: 5.542 seconds
Maximum number of seconds to run all queries: 5.556 seconds
Number of clients running queries: 1
Average number of queries per client: 20
Benchmark [[5.1.57]]
Average number of seconds to run all queries: 1.605 seconds
Minimum number of seconds to run all queries: 1.603 seconds
Maximum number of seconds to run all queries: 1.606 seconds
Number of clients running queries: 1
Average number of queries per client: 20
onyx:/var/tmp # for i in 6051 6053 6060; do mysqlslap -u erkan -P $i -h 10.255.128.4 --create-schema=sbtest -q "SELECT * from sbtest.sort_id group by id;" --number-of-queries=20 -i 3; done
Benchmark [[5.5.12]]
Average number of seconds to run all queries: 1.146 seconds
Minimum number of seconds to run all queries: 1.135 seconds
Maximum number of seconds to run all queries: 1.170 seconds
Number of clients running queries: 1
Average number of queries per client: 20
Benchmark [[MariaDB]]
Average number of seconds to run all queries: 4.359 seconds
Minimum number of seconds to run all queries: 4.352 seconds
Maximum number of seconds to run all queries: 4.366 seconds
Number of clients running queries: 1
Average number of queries per client: 20
Benchmark [[5.1.57]]
Average number of seconds to run all queries: 1.049 seconds
Minimum number of seconds to run all queries: 1.032 seconds
Maximum number of seconds to run all queries: 1.068 seconds
Number of clients running queries: 1
Average number of queries per client: 20
So MariaDb doesn't perform in this test. So I wonder I did something wrong ...
Regards
Erkan
--
über den grenzen muß die freiheit wohl wolkenlos sein
3
2
Hello,
This is to inform you that today I've pushed MWL#90 into 5.3-main. There was a
buildbot run on 5.3-subqueries-mwl90 tree before the push, and it did not show
any failures that weren't also present in 5.3-main.
The result of my merge has an issue that conceptually it is not a full merge,
because MWL#90 code is not able yet to take advantage of the possibily of
dynamic choice between IN->EXISTS (which was introduced in MWL#89 which Timour
pushed two weeks ago).
Adding support for MWL#89 and MWL#90 working together seems easy. Actually, I
initially intended to do it while merging, but then decided to do one thing at
a time.
My further plans are:
- look at the open bugs I have
- look at making MWL#89 and MWL#90 work together.
BR
Sergey
--
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog
2
1
I want to count errors by database user for IS.user_statistics. I have code
that does it in my_message_sql but hfisk thinks the counts might be
inflated. I don't understand the error reporting framework in MySQL so it is
possible that the counts are wrong or count errors and warnings rather than
only counting errors. The code below is from my_message_sql and shows where
I try to maintain the count. The assignment (thd->is_slave_error= 1) makes
me think that an error has occurred by the time I count it. But code that
follows has comments about clearing an error or converting an error to a
warning.
if ((thd= current_thd))
{
/*
TODO: There are two exceptions mechanism (THD and sp_rcontext),
this could be improved by having a common stack of handlers.
*/
if (thd->handle_error(error, str,
MYSQL_ERROR::WARN_LEVEL_ERROR))
DBUG_RETURN(0);
{
USER_STATS *us= thd_get_user_stats(thd);
--------------------------------> added by me
my_atomic_add_bigint(&(us->errors_total), 1);
--------------------------------> added by me
}
thd->is_slave_error= 1; // needed to catch query errors during
replication
/*
thd->lex->current_select == 0 if lex structure is not inited
(not query command (COM_QUERY))
*/
if (thd->lex->current_select &&
thd->lex->current_select->no_error && !thd->is_fatal_error)
{
DBUG_PRINT("error",
("Error converted to warning: current_select: no_error %d
"
"fatal_error: %d",
(thd->lex->current_select ?
thd->lex->current_select->no_error : 0),
(int) thd->is_fatal_error));
}
else
{
if (! thd->main_da.is_error()) // Return only first message
{
thd->main_da.set_error_status(thd, error, str);
}
query_cache_abort(&thd->net);
}
/*
If a continue handler is found, the error message will be cleared
by the stored procedures code.
*/
if (thd->spcont &&
! (MyFlags & ME_NO_SP_HANDLER) &&
thd->spcont->handle_error(error, MYSQL_ERROR::WARN_LEVEL_ERROR,
thd))
{
/*
Do not push any warnings, a handled error must be completely
silenced.
*/
DBUG_RETURN(0);
}
/* When simulating OOM, skip writing to error log to avoid mtr errors */
DBUG_EXECUTE_IF("simulate_out_of_memory", DBUG_RETURN(0););
if (!thd->no_warnings_for_error &&
!(MyFlags & ME_NO_WARNING_FOR_ERROR))
{
/*
Suppress infinite recursion if there a memory allocation error
inside push_warning.
*/
thd->no_warnings_for_error= TRUE;
push_warning(thd, MYSQL_ERROR::WARN_LEVEL_ERROR, error, str);
thd->no_warnings_for_error= FALSE;
}
}
/* When simulating OOM, skip writing to error log to avoid mtr errors */
DBUG_EXECUTE_IF("simulate_out_of_memory", DBUG_RETURN(0););
if (!thd || MyFlags & ME_NOREFRESH)
sql_print_error("%s: %s",my_progname,str); /* purecov: inspected */
DBUG_RETURN(0);
}
--
Mark Callaghan
mdcallag(a)gmail.com
2
1
Hi Timour,
While merging, I also got the following questions:
== exclude_expensive_cond ==
make_cond_for_table()'s exclude_expensive_cond parameter is not used anymore.
Was it intentional that you kept it in the code?
(minimizing MWL#89's diff size could not be a reason because the diff changes
make_cond_for_table signature anyway)
== exec_const_cond ==
The patch introduces JOIN::exec_const_cond. On the other hand, the code
already had JOIN::outer_ref_cond which seems to be nearly the same.
== join_tab_idx ==
make_join_select() attaches parts of WHERE/ON clauses to JOIN_TABs. It does
so with a help of two functions:
1. make_cond_for_table()
2. make_cond_after_sjm()
The first one does make set_join_tab_idx() calls, the second one doesn't. Why?
== join_tab_idx #2 ===
> @@ -7150,24 +7174,27 @@
> there inside the triggers.
> */
> { // Check const tables
> + join->exec_const_cond=
> + make_cond_for_table(thd, cond,
> join->const_table_map,
> - (table_map) 0, TRUE, FALSE);
> + (table_map) 0, MAX_TABLES, FALSE, FALSE);
As far as I understand the code, the MAX_TABLES part informs the constant
subquery predicate that it will be evaluated O(join_output_records) times,
which is not true.
This is not the only such case, I see plenty of make_cond_for_table() calls
which use MAX_TABLES for join_tab_idx argument, where we can actually tell for
certain that the condition will be evaluated much fewer than
O(join_output_records) times.
BR
Sergey
--
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog
2
1
Re: [Maria-developers] [Commits] Rev 2983: MWL#89 - Automatic merge with 5.3 in file:///home/tsk/mprog/src/5.3-mwl89-merge/
by Sergey Petrunya 23 May '11
by Sergey Petrunya 23 May '11
23 May '11
Hi Timour,
Congratulations on having pushed MWL#89 into 5.3-main! This is quite an
achievement which really moves us forward.
I've been studying the new code and has got some feedback which I thought
I need to share. Please find the notes below, marked with 'psergey'.
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/filesort.cc 5.3-timours-wl/sql/filesort.cc
> --- 5.3-timours-wl-clean/sql/filesort.cc 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/filesort.cc 2011-05-11 20:08:50.000000000 +0100
> @@ -612,10 +612,34 @@
> }
> DBUG_RETURN(HA_POS_ERROR); /* purecov: inspected */
> }
> +
> + bool write_record= false;
> if (error == 0)
> + {
> param->examined_rows++;
> -
> - if (error == 0 && (!select || select->skip_record(thd) > 0))
> + if (select && select->cond)
> + {
> + /*
> + If the condition 'select->cond' contains a subquery, restore the
> + original read/write sets of the table 'sort_form' because when
> + SQL_SELECT::skip_record evaluates this condition. it may include a
psergey:
- The last sentence doesn't parse, please fix.
- Why is the check done *for each record we enumerate*? Please move it outside
of the loop.
- I suspect that now, with Sanja's subquery code, each subquery has a list of
references it makes to outside, so this shortcut is not necessary (let's
discuss that on the next optimizer call)
> + correlated subquery predicate, such that some field in the subquery
> + refers to 'sort_form'.
> + */
> + if (select->cond->with_subselect)
> + sort_form->column_bitmaps_set(save_read_set, save_write_set,
> + save_vcol_set);
> + write_record= (select->skip_record(thd) > 0);
> + if (select->cond->with_subselect)
> + sort_form->column_bitmaps_set(&sort_form->tmp_set,
> + &sort_form->tmp_set,
> + &sort_form->tmp_set);
> + }
> + else
> + write_record= true;
> + }
> +
> + if (write_record)
> {
> if (idx == param->keys)
> {
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/item_cmpfunc.cc 5.3-timours-wl/sql/item_cmpfunc.cc
> --- 5.3-timours-wl-clean/sql/item_cmpfunc.cc 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/item_cmpfunc.cc 2011-05-11 20:08:50.000000000 +0100
> @@ -2072,6 +2072,18 @@
> }
>
>
> +bool Item_in_optimizer::is_expensive_processor(uchar *arg)
> +{
> + return args[1]->is_expensive_processor(arg);
> +}
> +
> +
> +bool Item_in_optimizer::is_expensive()
> +{
> + return args[1]->is_expensive();
> +}
> +
psergey: what about the cases when args[0] is expensive? It should either be
checked, or a comment here is needed with explanation why we don't need to check it.
> longlong Item_func_eq::val_int()
> {
> DBUG_ASSERT(fixed == 1);
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/item.h 5.3-timours-wl/sql/item.h
> --- 5.3-timours-wl-clean/sql/item.h 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/item.h 2011-05-11 20:08:50.000000000 +0100
> @@ -510,7 +521,7 @@
> SUBSELECT_ITEM, ROW_ITEM, CACHE_ITEM, TYPE_HOLDER,
> PARAM_ITEM, TRIGGER_FIELD_ITEM, DECIMAL_ITEM,
> XPATH_NODESET, XPATH_NODESET_CMP,
> - VIEW_FIXER_ITEM, EXPR_CACHE_ITEM};
> + VIEW_FIXER_ITEM, EXPR_CACHE_ITEM, UNKNOWN_ITEM};
psergey: When I grep for UNKNOWN_ITEM, I find only this occurence. Why add it?
>
> enum cond_result { COND_UNDEF,COND_OK,COND_TRUE,COND_FALSE };
>
...
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/item_subselect.cc 5.3-timours-wl/sql/item_subselect.cc
> --- 5.3-timours-wl-clean/sql/item_subselect.cc 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/item_subselect.cc 2011-05-11 20:08:50.000000000 +0100
...
> @@ -1659,49 +1696,40 @@
> {
> if (!(new_having= new Item_func_trig_cond(new_having,
> get_cond_guard(0))))
> - DBUG_RETURN(RES_ERROR);
> + DBUG_RETURN(true);
> }
> - new_having->name= (char*)in_having_cond;
> - select_lex->having= join->having= new_having;
> - select_lex->having_fix_field= 1;
> -
> - /*
> - we do not check join->having->fixed, because comparison function
> - (from func->create) can't be fixed after creation
> - */
> - tmp= join->having->fix_fields(thd, 0);
> - select_lex->having_fix_field= 0;
> - if (tmp)
> - DBUG_RETURN(RES_ERROR);
> +
> + new_having->name= (char*) in_having_cond;
> + if (fix_having(new_having, select_lex))
> + DBUG_RETURN(true);
> + *having_item= new_having;
> }
> else
> - {
> - // it is single select without tables => possible optimization
> - // remove the dependence mark since the item is moved to upper
> - // select and is not outer anymore.
> - item->walk(&Item::remove_dependence_processor, 0,
> - (uchar *) select_lex->outer_select());
> - item= func->create(left_expr, item);
> - // fix_field of item will be done in time of substituting
> - substitution= item;
> - have_to_be_excluded= 1;
> - if (thd->lex->describe)
> - {
> - char warn_buff[MYSQL_ERRMSG_SIZE];
> - sprintf(warn_buff, ER(ER_SELECT_REDUCED), select_lex->select_number);
> - push_warning(thd, MYSQL_ERROR::WARN_LEVEL_NOTE,
> - ER_SELECT_REDUCED, warn_buff);
> - }
> - DBUG_RETURN(RES_REDUCE);
> - }
> + DBUG_ASSERT(FALSE);
psergey: (one)
> }
> }
>
> - DBUG_RETURN(RES_OK);
> + DBUG_RETURN(false);
psergey: (two). I think using both 'false' and 'FALSE' in the same patch is
too much. Please change to TRUE/FALSE since we have it everywhere else.
> }
>
>
> -Item_subselect::trans_res
> +/**
> + Wrap a multi-column IN/ALL/ANY subselect into an Item_in_optimizer.
> +
> + @param join Join object of the subquery (i.e. 'child' join).
> +
> + @details
> + The subquery predicate is wrapped into an Item_in_optimizer. Later the query
> + optimization phase chooses whether the subquery under the Item_in_optimizer
> + will be further transformed into an equivalent correlated EXISTS by injecting
> + additional predicates, or will be executed via subquery materialization in its
> + unmodified form.
> +
> + @retval false The subquery was transformed
> + @retval true Error
> +*/
> +
> +bool
> Item_in_subselect::row_value_transformer(JOIN *join)
> {
> SELECT_LEX *select_lex= join->select_lex;
...
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/opt_subselect.cc 5.3-timours-wl/sql/opt_subselect.cc
> --- 5.3-timours-wl-clean/sql/opt_subselect.cc 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/opt_subselect.cc 2011-05-11 20:08:50.000000000 +0100
> @@ -175,63 +189,80 @@
> else
> {
> DBUG_PRINT("info", ("Subquery can't be converted to semi-join"));
> - /*
> - Check if the subquery predicate can be executed via materialization.
> - The required conditions are:
> - 1. Subquery predicate is an IN/=ANY subq predicate
> - 2. Subquery is a single SELECT (not a UNION)
> - 3. Subquery is not a table-less query. In this case there is no
> - point in materializing.
> - 3A The upper query is not a table-less SELECT ... FROM DUAL. We
> + /* Test if the user has set a legal combination of optimizer switches. */
> + if (!optimizer_flag(thd, OPTIMIZER_SWITCH_IN_TO_EXISTS) &&
> + !optimizer_flag(thd, OPTIMIZER_SWITCH_MATERIALIZATION))
> + my_error(ER_ILLEGAL_SUBQUERY_OPTIMIZER_SWITCHES, MYF(0));
> +
> + if (in_subs)
> + {
> + /* Subquery predicate is an IN/=ANY predicate. */
> + if (optimizer_flag(thd, OPTIMIZER_SWITCH_IN_TO_EXISTS))
> + in_subs->in_strategy|= SUBS_IN_TO_EXISTS;
> + if (optimizer_flag(thd, OPTIMIZER_SWITCH_MATERIALIZATION))
> + in_subs->in_strategy|= SUBS_MATERIALIZATION;
> +
> + /*
> + Check if the subquery predicate can be executed via materialization.
> + The required conditions are:
> + 1. Subquery is a single SELECT (not a UNION)
> + 2. Subquery is not a table-less query. In this case there is no
> + point in materializing.
> + 2A The upper query is not a table-less SELECT ... FROM DUAL. We
> can't do materialization for SELECT .. FROM DUAL because it
> does not call setup_subquery_materialization(). We could make
> SELECT ... FROM DUAL call that function but that doesn't seem
> to be the case that is worth handling.
> - 4. Either the subquery predicate is a top-level predicate, or at
> - least one partial match strategy is enabled. If no partial match
> - strategy is enabled, then materialization cannot be used for
> - non-top-level queries because it cannot handle NULLs correctly.
> - 5. Subquery is non-correlated
> - TODO:
> - This is an overly restrictive condition. It can be extended to:
> - (Subquery is non-correlated ||
> - Subquery is correlated to any query outer to IN predicate ||
> - (Subquery is correlated to the immediate outer query &&
> - Subquery !contains {GROUP BY, ORDER BY [LIMIT],
> - aggregate functions}) && subquery predicate is not under "NOT IN"))
> - 6. No execution method was already chosen (by a prepared statement).
> -
> - (*) The subquery must be part of a SELECT statement. The current
> - condition also excludes multi-table update statements.
> -
> - Determine whether we will perform subquery materialization before
> - calling the IN=>EXISTS transformation, so that we know whether to
> - perform the whole transformation or only that part of it which wraps
> - Item_in_subselect in an Item_in_optimizer.
> - */
> - if (optimizer_flag(thd, OPTIMIZER_SWITCH_MATERIALIZATION) &&
> - in_subs && // 1
> - !select_lex->is_part_of_union() && // 2
> - select_lex->master_unit()->first_select()->leaf_tables && // 3
> - thd->lex->sql_command == SQLCOM_SELECT && // *
> - select_lex->outer_select()->leaf_tables && // 3A
> - subquery_types_allow_materialization(in_subs) &&
> - // psergey-todo: duplicated_subselect_card_check: where it's done?
> - (in_subs->is_top_level_item() ||
> - optimizer_flag(thd, OPTIMIZER_SWITCH_PARTIAL_MATCH_ROWID_MERGE) ||
> - optimizer_flag(thd, OPTIMIZER_SWITCH_PARTIAL_MATCH_TABLE_SCAN)) &&//4
> - !in_subs->is_correlated && // 5
> - in_subs->exec_method == Item_in_subselect::NOT_TRANSFORMED) // 6
> - {
> - in_subs->exec_method= Item_in_subselect::MATERIALIZATION;
> - }
> -
> - Item_subselect::trans_res trans_res;
> - if ((trans_res= subselect->select_transformer(join)) !=
> - Item_subselect::RES_OK)
> - {
> - DBUG_RETURN((trans_res == Item_subselect::RES_ERROR));
> + 3. Either the subquery predicate is a top-level predicate, or at
> + least one partial match strategy is enabled. If no partial match
> + strategy is enabled, then materialization cannot be used for
> + non-top-level queries because it cannot handle NULLs correctly.
> + 4. Subquery is non-correlated
> + TODO:
> + This is an overly restrictive condition. It can be extended to:
> + (Subquery is non-correlated ||
> + Subquery is correlated to any query outer to IN predicate ||
> + (Subquery is correlated to the immediate outer query &&
> + Subquery !contains {GROUP BY, ORDER BY [LIMIT],
> + aggregate functions}) && subquery predicate is not under "NOT IN"))
> +
> + (*) The subquery must be part of a SELECT statement. The current
> + condition also excludes multi-table update statements.
> + */
> + if (!(in_subs->in_strategy & SUBS_MATERIALIZATION &&
> + !select_lex->is_part_of_union() && // 1
> + parent_unit->first_select()->leaf_tables && // 2
> + thd->lex->sql_command == SQLCOM_SELECT && // *
> + select_lex->outer_select()->leaf_tables && // 2A
> + subquery_types_allow_materialization(in_subs) &&
> + // psergey-todo: duplicated_subselect_card_check: where it's done?
> + (in_subs->is_top_level_item() || //3
> + optimizer_flag(thd,
> + OPTIMIZER_SWITCH_PARTIAL_MATCH_ROWID_MERGE) || //3
> + optimizer_flag(thd,
> + OPTIMIZER_SWITCH_PARTIAL_MATCH_TABLE_SCAN)) && //3
> + !in_subs->is_correlated)) //4
> + {
> + /* Materialization is not possible based on syntactic properties. */
> + in_subs->in_strategy&= ~SUBS_MATERIALIZATION;
> + }
> +
> + if (!in_subs->in_strategy)
> + {
> + /*
> + If neither materialization is possible, nor the user chose
> + IN-TO-EXISTS, choose IN-TO-EXISTS as the only universal strategy.
> + */
> + in_subs->in_strategy|= SUBS_IN_TO_EXISTS;
> + }
> }
psergey: The above seems unneccessarily convoluted. This is not apprent from
the patch, but the resulting code looks like this:
if (in_subs)
{
/* Subquery predicate is an IN/=ANY predicate. */
if (optimizer_flag(thd, OPTIMIZER_SWITCH_IN_TO_EXISTS))
in_subs->in_strategy|= SUBS_IN_TO_EXISTS;
if (optimizer_flag(thd, OPTIMIZER_SWITCH_MATERIALIZATION))
in_subs->in_strategy|= SUBS_MATERIALIZATION;
/* a long comment */
if (!(long condition))
{
/* Materialization is not possible based on syntactic properties. */
in_subs->in_strategy&= ~SUBS_MATERIALIZATION;
}
...
Is there any particular reason to first set SUBS_MATERIALIZATION flag, and
then optionally clear it? It seems it would be more straightforward to only
set it when materialization is really applicable?
> +
> + /*
> + Transform each subquery predicate according to its overloaded
> + transformer.
> + */
> + if (subselect->select_transformer(join))
> + DBUG_RETURN(-11);
> }
> }
> DBUG_RETURN(0);
...
> @@ -1830,15 +1919,15 @@
> - sj_inner_fanout*sj_outer_fanout lookups.
>
> */
> - double one_lookup_cost;
> - if (sj_outer_fanout*temptable_rec_size >
> - join->thd->variables.max_heap_table_size)
> - one_lookup_cost= DISK_TEMPTABLE_LOOKUP_COST;
> - else
> - one_lookup_cost= HEAP_TEMPTABLE_LOOKUP_COST;
> + double one_lookup_cost= get_tmp_table_lookup_cost(join->thd,
> + sj_outer_fanout,
> + temptable_rec_size);
> + double one_write_cost= get_tmp_table_write_cost(join->thd,
> + sj_outer_fanout,
> + temptable_rec_size);
>
> double write_cost= join->positions[first_tab].prefix_record_count*
> - sj_outer_fanout * one_lookup_cost;
> + sj_outer_fanout * one_write_cost;
> double full_lookup_cost= join->positions[first_tab].prefix_record_count*
> sj_outer_fanout* sj_inner_fanout *
> one_lookup_cost;
> @@ -3360,9 +3449,23 @@
> JOIN_TAB* join_tab=join->join_tab;
> SELECT_LEX_UNIT *unit= join->unit;
> DBUG_ENTER("rewrite_to_index_subquery_engine");
> +
> /*
> is this simple IN subquery?
> */
> + /* TODO: In order to use these more efficient subquery engines in more cases,
> + the following problems need to be solved:
> + - the code that removes GROUP BY (group_list), also adds an ORDER BY
> + (order), thus GROUP BY queries (almost?) never pass through this branch.
> + Solution: remove the test below '!join->order', because we remove the
> + ORDER clase for subqueries anyway.
> + - in order to set a more efficient engine, the optimizer needs to both
> + decide to remove GROUP BY, *and* select one of the JT_[EQ_]REF[_OR_NULL]
> + access methods, *and* loose scan should be more expensive or
> + inapliccable. When is that possible?
> + - Consider expanding the applicability of this rewrite for loose scan
> + for group by queries.
> + */
> if (!join->group_list && !join->order &&
> join->unit->item &&
> join->unit->item->substype() == Item_subselect::IN_SUBS &&
> @@ -3503,3 +3606,349 @@
> }
>
>
> +/**
> + Optimize all subqueries of a query that have were flattened into a semijoin.
psergey: a typo on the above line?
> +
> + @details
> + Optimize all immediate children subqueries of a query.
> +
> + This phase must be called after substitute_for_best_equal_field() because
> + that function may replace items with other items from a multiple equality,
> + and we need to reference the correct items in the index access method of the
> + IN predicate.
> +
> + @return Operation status
> + @retval FALSE success.
> + @retval TRUE error occurred.
> +*/
> +
> +bool JOIN::optimize_unflattened_subqueries()
> +{
> + return select_lex->optimize_unflattened_subqueries();
> +}
> +
> +
> +/**
> + Choose an optimal strategy to execute an IN/ALL/ANY subquery predicate
> + based on cost.
> +
> + @param join_tables the set of tables joined in the subquery
> +
> + @notes
> + The method chooses between the materialization and IN=>EXISTS rewrite
> + strategies for the execution of a non-flattened subquery IN predicate.
> + The cost-based decision is made as follows:
> +
> + 1. compute materialize_strategy_cost based on the unmodified subquery
> + 2. reoptimize the subquery taking into account the IN-EXISTS predicates
> + 3. compute in_exists_strategy_cost based on the reoptimized plan
> + 4. compare and set the cheaper strategy
> + if (materialize_strategy_cost >= in_exists_strategy_cost)
> + in_strategy = MATERIALIZATION
> + else
> + in_strategy = IN_TO_EXISTS
> + 5. if in_strategy = MATERIALIZATION and it is not possible to initialize it
> + revert to IN_TO_EXISTS
> + 6. if (in_strategy == MATERIALIZATION)
> + revert the subquery plan to the original one before reoptimizing
> + else
> + inject the IN=>EXISTS predicates into the new EXISTS subquery plan
> +
> + The implementation itself is a bit more complicated because it takes into
> + account two more factors:
> + - whether the user allowed both strategies through an optimizer_switch, and
> + - if materialization was the cheaper strategy, whether it can be executed
> + or not.
> +
> + @retval FALSE success.
> + @retval TRUE error occurred.
> +*/
> +
> +bool JOIN::choose_subquery_plan(table_map join_tables)
> +{
> + Query_plan_state save_qep; /* The original QEP of the subquery. */
> + enum_reopt_result reopt_result= REOPT_NONE;
> + Item_in_subselect *in_subs;
> +
> + if (select_lex->master_unit()->item &&
> + select_lex->master_unit()->item->is_in_predicate())
> + {
> + in_subs= (Item_in_subselect*) select_lex->master_unit()->item;
> + if (in_subs->create_in_to_exists_cond(this))
> + return true;
> + }
> + else
> + return false;
> +
> + DBUG_ASSERT(in_subs->in_strategy); /* A strategy must be chosen earlier. */
> + DBUG_ASSERT(in_to_exists_where || in_to_exists_having);
> + DBUG_ASSERT(!in_to_exists_where || in_to_exists_where->fixed);
> + DBUG_ASSERT(!in_to_exists_having || in_to_exists_having->fixed);
> +
> + /*
> + Compute and compare the costs of materialization and in-exists if both
> + strategies are possible and allowed by the user (checked during the prepare
> + phase.
> + */
> + if (in_subs->in_strategy & SUBS_MATERIALIZATION &&
> + in_subs->in_strategy & SUBS_IN_TO_EXISTS)
psergey: when I write conditions like the one above, Monty tells me to change
it to
in_subs->in_strategy & (SUBS_MATERIALIZATION |SUBS_IN_TO_EXISTS)
Did he not tell the same to you?
> + {
> + JOIN *outer_join;
> + JOIN *inner_join= this;
> + /* Number of (partial) rows of the outer JOIN filtered by the IN predicate. */
> + double outer_record_count;
> + /* Number of unique value combinations filtered by the IN predicate. */
> + double outer_lookup_keys;
> + /* Cost and row count of the unmodified subquery. */
> + double inner_read_time_1, inner_record_count_1;
> + /* Cost of the subquery with injected IN-EXISTS predicates. */
> + double inner_read_time_2;
> + /* The cost to compute IN via materialization. */
> + double materialize_strategy_cost;
> + /* The cost of the IN->EXISTS strategy. */
> + double in_exists_strategy_cost;
> + double dummy;
> +
> + /*
> + A. Estimate the number of rows of the outer table that will be filtered
> + by the IN predicate.
> + */
> + outer_join= unit->outer_select() ? unit->outer_select()->join : NULL;
> + if (outer_join)
> + {
> + uint outer_partial_plan_len;
> + /*
> + Make_cond_for_table is called for predicates only in the WHERE/ON
> + clauses. In all other cases, predicates are not pushed to any
> + JOIN_TAB, and their joi_tab_idx remains MAX_TABLES. Such predicates
> + are evaluated for each complete row of the outer join.
> + */
> + outer_partial_plan_len= (in_subs->get_join_tab_idx() == MAX_TABLES) ?
> + outer_join->tables :
> + in_subs->get_join_tab_idx() + 1;
> + outer_join->get_partial_join_cost(outer_partial_plan_len, &dummy,
> + &outer_record_count);
> + if (outer_join->tables > outer_join->const_tables)
> + outer_lookup_keys= prev_record_reads(outer_join->best_positions,
> + outer_partial_plan_len,
> + in_subs->used_tables());
> + else
> + {
> + /* If all tables are constant, positions is undefined. */
> + outer_lookup_keys= 1;
> + }
> + }
> + else
> + {
> + /*
> + TODO: outer_join can be NULL for DELETE statements.
> + How to compute its cost?
> + */
> + outer_record_count= 1;
> + outer_lookup_keys=1;
> + }
> + /*
> + There cannot be more lookup keys than the total number of records.
> + TODO: this a temporary solution until we find a better way to compute
> + get_partial_join_cost() and prev_record_reads() in a consitent manner,
> + where it is guaranteed that (outer_lookup_keys <= outer_record_count).
> + */
> + if (outer_lookup_keys > outer_record_count)
> + outer_lookup_keys= outer_record_count;
> +
> + /*
> + B. Estimate the cost and number of records of the subquery both
> + unmodified, and with injected IN->EXISTS predicates.
> + */
> + inner_read_time_1= inner_join->best_read;
> + inner_record_count_1= inner_join->record_count;
> +
> + if (in_to_exists_where && const_tables != tables)
> + {
> + /*
> + Re-optimize and cost the subquery taking into account the IN-EXISTS
> + conditions.
> + */
> + reopt_result= reoptimize(in_to_exists_where, join_tables, &save_qep);
> + if (reopt_result == REOPT_ERROR)
> + return TRUE;
> +
> + /* Get the cost of the modified IN-EXISTS plan. */
> + inner_read_time_2= inner_join->best_read;
> +
> + }
> + else
> + {
> + /* Reoptimization would not produce any better plan. */
> + inner_read_time_2= inner_read_time_1;
> + }
> +
> + /*
> + C. Compute execution costs.
> + */
> + /* C.1 Compute the cost of the materialization strategy. */
> + uint rowlen= get_tmp_table_rec_length(unit->first_select()->item_list);
> + /* The cost of writing one row into the temporary table. */
> + double write_cost= get_tmp_table_write_cost(thd, inner_record_count_1,
> + rowlen);
> + /* The cost of a lookup into the unique index of the materialized table. */
> + double lookup_cost= get_tmp_table_lookup_cost(thd, inner_record_count_1,
> + rowlen);
> + /*
> + The cost of executing the subquery and storing its result in an indexed
> + temporary table.
> + */
> + double materialization_cost= inner_read_time_1 +
> + write_cost * inner_record_count_1;
> +
> + materialize_strategy_cost= materialization_cost +
> + outer_record_count * lookup_cost;
> +
> + /* C.2 Compute the cost of the IN=>EXISTS strategy. */
> + in_exists_strategy_cost= outer_lookup_keys * inner_read_time_2;
> +
> + /* C.3 Compare the costs and choose the cheaper strategy. */
> + if (materialize_strategy_cost >= in_exists_strategy_cost)
> + in_subs->in_strategy&= ~SUBS_MATERIALIZATION;
> + else
> + in_subs->in_strategy&= ~SUBS_IN_TO_EXISTS;
> + }
> +
> + /*
> + If (1) materialization is a possible strategy based on semantic analysis
> + during the prepare phase, then if
> + (2) it is more expensive than the IN->EXISTS transformation, and
> + (3) it is not possible to create usable indexes for the materialization
> + strategy,
> + fall back to IN->EXISTS.
> + otherwise
> + use materialization.
> + */
> + if (in_subs->in_strategy & SUBS_MATERIALIZATION &&
> + in_subs->setup_mat_engine())
> + {
> + /*
> + If materialization was the cheaper or the only user-selected strategy,
> + but it is not possible to execute it due to limitations in the
> + implementation, fall back to IN-TO-EXISTS.
> + */
> + in_subs->in_strategy&= ~SUBS_MATERIALIZATION;
> + in_subs->in_strategy|= SUBS_IN_TO_EXISTS;
> + }
> +
> + if (in_subs->in_strategy & SUBS_MATERIALIZATION)
> + {
> + /* Restore the original query plan used for materialization. */
> + if (reopt_result == REOPT_NEW_PLAN)
> + restore_query_plan(&save_qep);
> +
> + /* TODO: should we set/unset this flag for both select_lex and its unit? */
> + in_subs->unit->uncacheable&= ~UNCACHEABLE_DEPENDENT;
> + select_lex->uncacheable&= ~UNCACHEABLE_DEPENDENT;
> +
> + /*
> + Reset the "LIMIT 1" set in Item_exists_subselect::fix_length_and_dec.
> + TODO:
> + Currently we set the subquery LIMIT to infinity, and this is correct
> + because we forbid at parse time LIMIT inside IN subqueries (see
> + Item_in_subselect::test_limit). However, once we allow this, here
> + we should set the correct limit if given in the query.
> + */
> + in_subs->unit->global_parameters->select_limit= NULL;
> + in_subs->unit->set_limit(unit->global_parameters);
> + /*
> + Set the limit of this JOIN object as well, because normally its being
> + set in the beginning of JOIN::optimize, which was already done.
> + */
> + select_limit= in_subs->unit->select_limit_cnt;
> + }
> + else if (in_subs->in_strategy & SUBS_IN_TO_EXISTS)
> + {
> + if (reopt_result == REOPT_NONE && in_to_exists_where &&
> + const_tables != tables)
> + {
> + /*
> + The subquery was not reoptimized either because the user allowed only the
> + IN-EXISTS strategy, or because materialization was not possible based on
> + semantic analysis. Clenup the original plan and reoptimize.
> + */
> + for (uint i= 0; i < tables; i++)
> + {
> + join_tab[i].keyuse= NULL;
> + join_tab[i].checked_keys.clear_all();
> + }
> + if ((reopt_result= reoptimize(in_to_exists_where, join_tables, NULL)) ==
> + REOPT_ERROR)
> + return TRUE;
> + }
> +
> + if (in_subs->inject_in_to_exists_cond(this))
> + return TRUE;
> + }
> + else
> + DBUG_ASSERT(FALSE);
> +
> + return FALSE;
> +}
> +
> +
> +/**
> + Choose a query plan for a table-less subquery.
> +
> + @notes
> +
> + @retval FALSE success.
> + @retval TRUE error occurred.
> +*/
> +
> +bool JOIN::choose_tableless_subquery_plan()
> +{
> + DBUG_ASSERT(!tables_list || !tables);
> + if (select_lex->master_unit()->item)
> + {
> + DBUG_ASSERT(select_lex->master_unit()->item->type() ==
> + Item::SUBSELECT_ITEM);
> + Item_subselect *subs_predicate= select_lex->master_unit()->item;
> +
> + /*
> + If the optimizer determined that his query has an empty result,
> + in most cases the subquery predicate is a known constant value -
> + either FALSE or NULL. The implementation of Item_subselect::reset()
> + determines which one.
> + */
> + if (zero_result_cause)
> + {
> + if (!implicit_grouping)
> + {
> + /*
> + Both group by queries and non-group by queries without aggregate
> + functions produce empty subquery result.
> + */
> + subs_predicate->reset();
> + subs_predicate->make_const();
> + return FALSE;
> + }
> +
> + /* TODO:
> + A further optimization is possible when a non-group query with
> + MIN/MAX/COUNT is optimized by opt_sum_query. Then, if there are
> + only MIN/MAX functions over an empty result set, the subquery
> + result is a NULL value/row, thus the value of subs_predicate is
> + NULL.
> + */
> + }
> +
> + if (subs_predicate->is_in_predicate())
> + {
> + Item_in_subselect *in_subs;
> + in_subs= (Item_in_subselect*) subs_predicate;
> + in_subs->in_strategy= SUBS_IN_TO_EXISTS;
> + if (in_subs->create_in_to_exists_cond(this) ||
> + in_subs->inject_in_to_exists_cond(this))
> + return TRUE;
> + tmp_having= having;
> + }
> + }
> + return FALSE;
> +}
> +
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/sql_lex.cc 5.3-timours-wl/sql/sql_lex.cc
> --- 5.3-timours-wl-clean/sql/sql_lex.cc 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/sql_lex.cc 2011-05-11 20:08:50.000000000 +0100
> @@ -1684,6 +1684,31 @@
> slave= 0;
> }
>
> +
> +void st_select_lex_node::add_slave(st_select_lex_node *slave_arg)
> +{
> + for (; slave; slave= slave->next)
> + if (slave == slave_arg)
> + return;
> +
> + if (slave)
> + {
> + st_select_lex_node *slave_arg_slave= slave_arg->slave;
> + /* Insert in the front of list of slaves if any. */
> + slave_arg->include_neighbour(slave);
> + /* include_neighbour() sets slave_arg->slave=0, restore it. */
> + slave_arg->slave= slave_arg_slave;
> + /* Count on include_neighbour() setting the master. */
> + DBUG_ASSERT(slave_arg->master == this);
> + }
> + else
> + {
> + slave= slave_arg;
> + slave_arg->master= this;
> + }
> +}
psergey:
1. Suppose slave_arg is already in the list of slaves (based on the function's
code, this seems to be possible) Why does this function adjust this->slave in
this case?
2. I've made the following change:
=== modified file 'sql/sql_lex.cc'
--- sql/sql_lex.cc 2011-05-05 12:24:28 +0000
+++ sql/sql_lex.cc 2011-05-13 21:34:49 +0000
@@ -1693,6 +1693,7 @@ void st_select_lex_node::add_slave(st_se
if (slave)
{
+ DBUG_ASSERT(0);
st_select_lex_node *slave_arg_slave= slave_arg->slave;
/* Insert in the front of list of slaves if any. */
slave_arg->include_neighbour(slave);
and all t/subselect*.test still passed. I think the code inside if () { ... }
is deadcode. If yes, please remove it, if not, please add coverage to the
testsuite.
> +
> +
> /*
> include on level down (but do not link)
>
...
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/sql_list.h 5.3-timours-wl/sql/sql_list.h
> --- 5.3-timours-wl-clean/sql/sql_list.h 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/sql_list.h 2011-05-11 20:08:50.000000000 +0100
> @@ -260,7 +260,7 @@
> list_node *node= first;
> list_node *list_first= list->first;
> elements=0;
> - while (node && node != list_first)
> + while (node->info && node != list_first)
psergey:
Why the above change? Does this mean that List<T> now can contain (T*)NULL ?
> {
> prev= &node->next;
> node= node->next;
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/sql_select.cc 5.3-timours-wl/sql/sql_select.cc
> --- 5.3-timours-wl-clean/sql/sql_select.cc 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/sql_select.cc 2011-05-11 20:08:50.000000000 +0100
...
> @@ -882,6 +887,7 @@
> "Impossible HAVING" : "Impossible WHERE";
> tables= 0;
> error= 0;
> + choose_tableless_subquery_plan();
> goto setup_subq_exit;
> }
psergey: there are three instances of this piece of code:
if (.. we've figured out this is a degenerate join which produces no rows ...)
{
...
choose_tableless_subquery_plan();
goto setup_subq_exit;
}
I'm afraid if somebody invents a fourth way to figure out that the join is
degenerate, they will forget to make call to choose_tableless_subquery_plan().
Could you prevent this by introducing
empty_join_output:
choose_tableless_subquery_plan();
... whatever else ...
setup_subq_exit:
and using appropriate goto's?
> }
> @@ -926,12 +932,13 @@
> */
> if ((res=opt_sum_query(select_lex->leaf_tables, all_fields, conds)))
> {
> - if (res == HA_ERR_KEY_NOT_FOUND)
> + if (res == HA_ERR_KEY_NOT_FOUND || res < 0)
psergey: What is this for? I am looking at opt_sum_query code and I don't
see any cases where it will return a negative value?
> {
> DBUG_PRINT("info",("No matching min/max row"));
> zero_result_cause= "No matching min/max row";
> tables= 0;
> error=0;
> + choose_tableless_subquery_plan();
> goto setup_subq_exit;
> }
> if (res > 1)
...
> @@ -9036,11 +9070,16 @@
> *simple_order=0; // Must do a temp table to sort
> else if (!(order_tables & not_const_tables))
> {
> - if (order->item[0]->with_subselect &&
> - !(join->select_lex->options & SELECT_DESCRIBE))
> - order->item[0]->val_str(&order->item[0]->str_value);
> + if (order->item[0]->with_subselect)
> + {
> + /*
> + Delay the evaluation of constant ORDER and/or GROUP expressions that
> + contain subqueries until the execution phase.
> + */
> + join->exec_const_order_group_cond.push_back(order->item[0]);
psergey: I'm wondering why would one want to evaluate them at all in that
case?
If we have
SELECT ... GROUP BY const_subquery_expression, ...
we don't need to evaluate const_subquery_expression at all, do we?
> + }
> DBUG_PRINT("info",("removing: %s", order->item[0]->full_name()));
> - continue; // skip const item
> + continue;
> }
> else
> {
...
> @@ -20316,5 +20363,154 @@
>
>
> /**
> + Save a query execution plan so that the caller can revert to it if needed,
> + and reset the current query plan so that it can be reoptimized.
> +
> + @param save_to The object into which the current query plan state is saved
> +*/
> +
> +void JOIN::save_query_plan(Query_plan_state *save_to)
> +{
> + if (keyuse.elements)
> + {
> + DYNAMIC_ARRAY tmp_keyuse;
> + /* Swap the current and the backup keyuse internal arrays. */
> + tmp_keyuse= keyuse;
> + keyuse= save_to->keyuse; /* keyuse is reset to an empty array. */
> + save_to->keyuse= tmp_keyuse;
> +
> + for (uint i= 0; i < tables; i++)
> + {
> + save_to->join_tab_keyuse[i]= join_tab[i].keyuse;
> + join_tab[i].keyuse= NULL;
> + save_to->join_tab_checked_keys[i]= join_tab[i].checked_keys;
> + join_tab[i].checked_keys.clear_all();
> + }
> + }
> + memcpy((uchar*) save_to->best_positions, (uchar*) best_positions,
> + sizeof(POSITION) * (tables + 1));
> + memset(best_positions, 0, sizeof(POSITION) * (tables + 1));
> +}
> +
> +
> +/**
> + Restore a query execution plan previously saved by the caller.
> +
> + @param The object from which the current query plan state is restored.
> +*/
> +
> +void JOIN::restore_query_plan(Query_plan_state *restore_from)
> +{
> + if (restore_from->keyuse.elements)
> + {
> + DYNAMIC_ARRAY tmp_keyuse;
> + tmp_keyuse= keyuse;
> + keyuse= restore_from->keyuse;
> + restore_from->keyuse= tmp_keyuse;
> +
> + for (uint i= 0; i < tables; i++)
> + {
> + join_tab[i].keyuse= restore_from->join_tab_keyuse[i];
> + join_tab[i].checked_keys= restore_from->join_tab_checked_keys[i];
> + }
> +
> + }
> + memcpy((uchar*) best_positions, (uchar*) restore_from->best_positions,
> + sizeof(POSITION) * (tables + 1));
> +}
> +
> +
> +/**
> + Reoptimize a query plan taking into account an additional conjunct to the
> + WHERE clause.
> +
> + @param added_where An extra conjunct to the WHERE clause to reoptimize with
> + @param join_tables The set of tables to reoptimize
> + @param save_to If != NULL, save here the state of the current query plan
> +
> + @notes
> + Given a query plan that was already optimized taking into account some WHERE
> + clause 'C', reoptimize this plan with a new WHERE clause 'C AND added_where'.
> + The reoptimization works as follows:
> +
> + 1. Call update_ref_and_keys *only* for the new conditions 'added_where'
> + that are about to be injected into the query.
> + 2. Expand if necessary the original KEYUSE array JOIN::keyuse to
> + accommodate the new REF accesses computed for the 'added_where' condition.
> + 3. Add the new KEYUSEs into JOIN::keyuse.
> + 4. Re-sort and re-filter the JOIN::keyuse array with the newly added
> + KEYUSE elements.
psergey:
I think the above scheme has some flaws. What if the initial filtering
has discarded elements that would be useful with the new injected condition?
Consider this example:
create table ten (a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table one_k (a int);
insert into one_k select A.a + B.a*10 + C.a*100 from ten A, ten B, ten C;
create table t11 (kp1 int, kp2 int, a int, filler char(100), key(kp1, kp2));
insert into t11 select a/100, a, a, 'filler' from one_k;
If I debug this query:
explain select a from one_k where a in (select kp1 from t11 where kp2=10 and a=4) or a+1 < 100000;
I see that the optimizer considers access to t11 via lookup on t11.kp1=...
but not on (t11.kp1=... AND t11.kp2=..).
Please note that from user point of view this will look like a regression.
> +
> + @retval REOPT_NEW_PLAN there is a new plan.
> + @retval REOPT_OLD_PLAN no new improved plan was produced, use the old one.
> + @retval REOPT_ERROR an irrecovarable error occured during reoptimization.
> +*/
> +
> +JOIN::enum_reopt_result
> +JOIN::reoptimize(Item *added_where, table_map join_tables,
> + Query_plan_state *save_to)
> +{
psergey:
This function looks very odd. We touch "keyuse" only when
restore_from->keyuse.elements!=NULL
What about the case when
this->keyuse.elements!=NULL && restore_from->keyuse.elements==NULL
?
This looks like a possible scenario to me. Suppose that
- original (i.e. Materialization) plan produces no KEYUSE-s
- IN-to-EXISTS plan does produce KEYUSE
and we're restoring from IN-to-EXISTS to Materialization plan?
> + DYNAMIC_ARRAY added_keyuse;
> + SARGABLE_PARAM *sargables= 0; /* Used only as a dummy parameter. */
> + uint org_keyuse_elements;
> +
> + /* Re-run the REF optimizer to take into account the new conditions. */
> + if (update_ref_and_keys(thd, &added_keyuse, join_tab, tables, added_where,
> + ~outer_join, select_lex, &sargables))
> + {
> + delete_dynamic(&added_keyuse);
> + return REOPT_ERROR;
> + }
> +
> + if (!added_keyuse.elements)
> + {
> + delete_dynamic(&added_keyuse);
> + return REOPT_OLD_PLAN;
> + }
> +
> + if (save_to)
> + save_query_plan(save_to);
> +
> + if (!keyuse.buffer &&
> + my_init_dynamic_array(&keyuse, sizeof(KEYUSE), 20, 64))
> + {
> + delete_dynamic(&added_keyuse);
> + return REOPT_ERROR;
> + }
> +
> + org_keyuse_elements= save_to ? save_to->keyuse.elements : keyuse.elements;
> + allocate_dynamic(&keyuse, org_keyuse_elements + added_keyuse.elements);
> +
> + /* If needed, add the access methods from the original query plan. */
> + if (save_to)
> + {
> + DBUG_ASSERT(!keyuse.elements);
> + memcpy(keyuse.buffer,
> + save_to->keyuse.buffer,
> + (size_t) save_to->keyuse.elements * keyuse.size_of_element);
> + keyuse.elements= save_to->keyuse.elements;
> + }
> +
> + /* Add the new access methods to the keyuse array. */
> + memcpy(keyuse.buffer + keyuse.elements * keyuse.size_of_element,
> + added_keyuse.buffer,
> + (size_t) added_keyuse.elements * added_keyuse.size_of_element);
> + keyuse.elements+= added_keyuse.elements;
> + /* added_keyuse contents is copied, and it is no longer needed. */
> + delete_dynamic(&added_keyuse);
> +
> + if (sort_and_filter_keyuse(&keyuse))
> + return REOPT_ERROR;
> + optimize_keyuse(this, &keyuse);
> +
> + /* Re-run the join optimizer to compute a new query plan. */
> + if (choose_plan(this, join_tables))
> + return REOPT_ERROR;
> +
> + return REOPT_NEW_PLAN;
> +}
> +
> +
> +/**
> @} (end of group Query_Optimizer)
> */
> diff -urN --exclude='.*' 5.3-timours-wl-clean/sql/sql_select.h 5.3-timours-wl/sql/sql_select.h
> --- 5.3-timours-wl-clean/sql/sql_select.h 2011-05-11 20:12:50.000000000 +0100
> +++ 5.3-timours-wl/sql/sql_select.h 2011-05-11 20:08:50.000000000 +0100
> @@ -603,8 +603,53 @@
>
> class JOIN :public Sql_alloc
> {
> +private:
> JOIN(const JOIN &rhs); /**< not implemented */
> JOIN& operator=(const JOIN &rhs); /**< not implemented */
> +
> +protected:
> +
> + /**
> + The subset of the state of a JOIN that represents an optimized query
> + execution plan. Allows saving/restoring different plans for the same query.
> + */
> + class Query_plan_state {
psergey: I have a concern about naming. This is really a join plan state, not
a query plan state. We may get into mess if at some point we'll need to save
plan state for the whole query.
Is it still possible to rename this to something like Join_plan_state?
> + public:
> + DYNAMIC_ARRAY keyuse; /* Copy of the JOIN::keyuse array. */
> + POSITION best_positions[MAX_TABLES+1]; /* Copy of JOIN::best_positions */
> + /* Copies of the JOIN_TAB::keyuse pointers for each JOIN_TAB. */
> + KEYUSE *join_tab_keyuse[MAX_TABLES];
> + /* Copies of JOIN_TAB::checked_keys for each JOIN_TAB. */
> + key_map join_tab_checked_keys[MAX_TABLES];
> + public:
> + Query_plan_state()
> + {
> + keyuse.elements= 0;
> + keyuse.buffer= NULL;
> + }
> + Query_plan_state(JOIN *join);
> + ~Query_plan_state()
> + {
> + delete_dynamic(&keyuse);
> + }
> + };
> +
> + /* Results of reoptimizing a JOIN via JOIN::reoptimize(). */
> + enum enum_reopt_result {
> + REOPT_NEW_PLAN, /* there is a new reoptimized plan */
> + REOPT_OLD_PLAN, /* no new improved plan can be found, use the old one */
> + REOPT_ERROR, /* an irrecovarable error occured during reoptimization */
> + REOPT_NONE /* not yet reoptimized */
> + };
> +
> + /* Support for plan reoptimization with rewritten conditions. */
> + enum_reopt_result reoptimize(Item *added_where, table_map join_tables,
> + Query_plan_state *save_to);
> + void save_query_plan(Query_plan_state *save_to);
> + void restore_query_plan(Query_plan_state *restore_from);
> + /* Choose a subquery plan for a table-less subquery. */
> + bool choose_tableless_subquery_plan();
> +
> public:
> JOIN_TAB *join_tab,**best_ref;
> JOIN_TAB **map2table; ///< mapping between table indexes and JOIN_TABs
> @@ -702,6 +747,13 @@
> account the changes made by test_if_skip_sort_order()).
> */
> double best_read;
> + /*
> + Estimated result rows (fanout) of the whole query. If this is a subquery
psergey: I think it would be better to change "whole query" to "join
operation" or "the select" here.
> + that is reexecuted multiple times, this value includes the estiamted # of
> + reexecutions. This value is equal to the multiplication of all
> + join->positions[i].records_read of a JOIN.
> + */
> + double record_count;
> List<Item> *fields;
> List<Cached_item> group_fields, group_fields_cache;
> TABLE *tmp_table;
..
> @@ -1233,6 +1313,9 @@
> if (!inited)
> {
> inited=1;
psergey: I don't understand why this wasn't needed before MWL#89, but is
needed after MWL#89. Do you still remember why you've added it?
> + TABLE *table= to_field->table;
> + my_bitmap_map *old_map= dbug_tmp_use_all_columns(table,
> + table->write_set);
> if ((res= item->save_in_field(to_field, 1)))
> {
> if (!err)
BR
Sergey
--
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog
2
2
Re: [Maria-developers] [Commits] Rev 3005: Changed MariaDB error numbers to start from 1900 to not conflict with MySQL error numbers in lp:maria/5.3
by Sergei Golubchik 23 May '11
by Sergei Golubchik 23 May '11
23 May '11
Hi, Michael!
There are more MariaDB-only error messages, I think you need to move all
of them.
On May 21, Michael Widenius wrote:
> === modified file 'sql/share/errmsg.txt'
> --- a/sql/share/errmsg.txt 2011-05-18 13:27:19 +0000
> +++ b/sql/share/errmsg.txt 2011-05-20 21:46:18 +0000
> @@ -6214,6 +6214,19 @@ ER_DEBUG_SYNC_HIT_LIMIT
> eng "debug sync point hit limit reached"
> ger "Debug Sync Point Hit Limit erreicht"
>
> +#
> +# MariaDB error messages section starts here
> +#
> +
> +# The following is here to allow us to detect if there was missing
> +# error messages in the errmsg.sys file
> +
> +ER_LAST_MYSQL_ERROR_MESSAGE
> + eng ""
> +
> +# MariaDB error numbers starts from 1900
> +start-error-number 1900
> +
> ER_VCOL_BASED_ON_VCOL
> eng "A computed column cannot be based on a computed column"
Regards,
Sergei
2
1
(
1. Sent this mail few days ago to Monty but no answer yet. I know, he is
busy man.
2. I use name "MySQL" because this is real name for me. Oracle still sucks:P
)
Hi!
We all have seen cases where people try to use SQL for something what
does not fit there. Reason is simple - file system itself is very good
database for some type of objects and queries.
One common MySQL usage scenario is web portal. Often portal has some
sort of news and photos accompanying it. My idea is to make some sort of
clever way to store images within MySQL making their read/write as fast
as possible and meanwhile giving extra value to user who shouldn't care
about data integrity keeping data in and out of SQL coherent. Also would
be cool to get all required data with single query.
I think best way to implement this is to introduce field type "file"
which is stored outside of usual data storage. Only his physical
location on disk is stored in MyISAM or similar. This is exactly same as
people keep file name in CHAR currently but we provide method to fetch
file contents with same query. Also we can implement cool replication to
keep same file on all servers consistent, way to backup things via
mysqldump etc....
Actually idea is bit wider but what you think generally?
Tõnu
3
2
Hi!
Here is finally a review of the microsecond patch.
(Took me 5 days to complete! Lots of code!!!)
I got this by doing
bzr diff -r tag:mysql-5.1.54..-1
in the 5.1-micro tree.
> === modified file 'client/mysqltest.cc'
> --- client/mysqltest.cc 2010-11-22 09:21:10 +0000
> +++ client/mysqltest.cc 2011-02-23 17:26:02 +0000
> @@ -7342,7 +7342,8 @@ int util_query(MYSQL* org_mysql, const c
> cur_con->util_mysql= mysql;
> }
>
> - return mysql_query(mysql, query);
> + int ret= mysql_query(mysql, query);
> + DBUG_RETURN(ret);
> }
Not necessary change, but not important enough to revert.
> === modified file 'include/my_sys.h'
<cut>
> +#define my_micro_time() (my_getsystime()/10)
I am not happy with the above change. The old my_micro_time() was
defined to be significantly faster than getsystime() for some systems
(especially windows and solaris) and we should consider keeping the
old implementation....
<long phone conversation>
Conclusion: my_getsystime() is not depending on the clock and only to
be used when calculating intervals.
> +#define hrtime_to_time(X) ((X).val/1000000)
> +#define hrtime_from_time(X) ((ulonglong)((X)*1000000ULL))
> +#define hrtime_to_double(X) ((X).val/1e6)
I think you need to add a cast to double avoid compiler warnings on
all platforms. (check buildbot!)
<cut>
>
> === modified file 'include/my_time.h'
> --- include/my_time.h 2008-04-03 17:14:57 +0000
> +++ include/my_time.h 2011-03-26 10:59:34 +0000
> @@ -55,6 +55,7 @@ typedef long my_time_t;
> /* Flags to str_to_datetime */
> #define TIME_FUZZY_DATE 1
> #define TIME_DATETIME_ONLY 2
> +#define TIME_TIME_ONLY 4
> /* Must be same as MODE_NO_ZERO_IN_DATE */
> #define TIME_NO_ZERO_IN_DATE (65536L*2*2*2*2*2*2*2)
> /* Must be same as MODE_NO_ZERO_DATE */
> @@ -68,8 +69,9 @@ typedef long my_time_t;
> #define TIME_MAX_HOUR 838
> #define TIME_MAX_MINUTE 59
> #define TIME_MAX_SECOND 59
> +#define TIME_MAX_SECOND_PART 999999
> #define TIME_MAX_VALUE (TIME_MAX_HOUR*10000 + TIME_MAX_MINUTE*100 + \
> - TIME_MAX_SECOND)
> + TIME_MAX_SECOND + TIME_MAX_SECOND_PART/1e6)
Make 1e6 a constant and use this in my_sys.h and here.
> #define TIME_MAX_VALUE_SECONDS (TIME_MAX_HOUR * 3600L + \
> TIME_MAX_MINUTE * 60L + TIME_MAX_SECOND)
>
> @@ -80,16 +82,17 @@ str_to_datetime(const char *str, uint le
> uint flags, int *was_cut);
> longlong number_to_datetime(longlong nr, MYSQL_TIME *time_res,
> uint flags, int *was_cut);
> +int number_to_time(double nr, MYSQL_TIME *ltime, int *was_cut);
> ulonglong TIME_to_ulonglong_datetime(const MYSQL_TIME *);
> ulonglong TIME_to_ulonglong_date(const MYSQL_TIME *);
> ulonglong TIME_to_ulonglong_time(const MYSQL_TIME *);
> ulonglong TIME_to_ulonglong(const MYSQL_TIME *);
> +double TIME_to_double(const MYSQL_TIME *my_time);
>
> +longlong pack_time(MYSQL_TIME *my_time);
> +MYSQL_TIME *unpack_time(longlong packed, MYSQL_TIME *my_time);
>
> -my_bool str_to_time(const char *str,uint length, MYSQL_TIME *l_time,
> - int *warning);
> -
> -int check_time_range(struct st_mysql_time *, int *warning);
> +int check_time_range(struct st_mysql_time *my_time, uint dec, int *warning);
>
> long calc_daynr(uint year,uint month,uint day);
> uint calc_days_in_year(uint year);
> @@ -136,11 +139,30 @@ void set_zero_time(MYSQL_TIME *tm, enum
> sent using binary protocol fit in this buffer.
> */
> #define MAX_DATE_STRING_REP_LENGTH 30
> +#define MAX_SEC_PART_VALUE 999999
> +#define MAX_SEC_PART_DIGITS 6
> +#define AUTO_SEC_PART_DIGITS 31 /* same as NOT_FIXED_DEC */
Do we really need to have MAX_SEC_PART_VALUE and TIME_MAX_SECOND_PART ?
Shouldn't these always be the same for mysys functions?
Also MAX_SEC_PART_DIGITS should be changed to TIME_MAX_SECOND_PART_DIGITS
<cut>
> === modified file 'mysql-test/include/ps_conv.inc'
> @@ -1153,19 +1146,19 @@ select '-- select .. where date/time col
> set @arg00= '1991-01-01 01:01:01' ;
> select 'true' as found from t9
> where c1= 20 and c13= CAST('1991-01-01 01:01:01' AS DATE) and c14= '1991-01-01 01:01:01' and
> - c15= '1991-01-01 01:01:01' and c16= '1991-01-01 01:01:01' and
> + c15= '1991-01-01 01:01:01' and
You need to add at least one test that shows the failure of doing a
comparison between datetime and time.
> === added file 'mysql-test/include/type_hrtime.inc'
> --- mysql-test/include/type_hrtime.inc 1970-01-01 00:00:00 +0000
> +++ mysql-test/include/type_hrtime.inc 2011-03-01 12:24:36 +0000
> @@ -0,0 +1,94 @@
> +
> +--source include/have_innodb.inc
> +
> +--disable_warnings
> +drop table if exists t1, t2, t3;
> +--enable_warnings
> +
> +--error ER_TOO_BIG_PRECISION
> +eval create table t1 (a $type(7));
> +
> +eval create table t1 (a $type(3));
> +insert t1 values ('2010-12-11 01:02:03.4567');
> +insert t1 values (20101211010203.45678);
> +insert t1 values (20101211030405.789e0);
> +insert t1 values (99991231235959e1);
> +select * from t1;
> +select truncate(a, 6) from t1; # Field::val_real()
> +select a DIV 1 from t1; # Field::val_int()
> +select group_concat(distinct a) from t1; # Field::cmp()
> +alter table t1 engine=innodb;
> +select * from t1;
> +drop table t1;
> +eval create table t1 (a $type(4)) engine=innodb;
> +insert t1 values ('2010-12-11 01:02:03.456789');
> +select * from t1;
> +select extract(microsecond from a + interval 100 microsecond) from t1 where a>'2010-11-12 01:02:03.456';
> +select a from t1 where a>'2010-11-12 01:02:03.456' group by a;
Shouldn't we do the above for all engines ?
If you can, please add at test of this in suite/engines/time.test
It would be good to add a combinations file:
[xtradb]
--default-engine=innodb
[aira]
--default-engine=aria
[myisam]
--default-engine=myisam
[heap]
--default-engine=heap
The problem I see with this is:
- How to ignore running tests if an engine doesn't exist?
- How to handle different result files ?
- Adding the combination name to the result file may be a good
temporary solution until we can do something based on 'diff'
<cut>
> +#
> +# Views
> +#
> +
> +create view v1 as select * from t1 group by a,b;
> +select * from v1;
> +show columns from v1;
> +create table t2 select * from v1;
> +show create table t2;
Please also select some data from t2.
> +
> +drop view v1;
> +drop table t1, t2;
Good set of tests. The things that was not tested, as far as I can see:
- INSERT SELECT between tables that has different column types. It
would be good to test all common combinations.
date -> datetime
date -> string
time -> datetime
time -> string
datetime -> time
datetime -> date
datetime -> timestamp
datetime -> string
timestamp -> datetime
timestamp -> date
timestamp -> time
timestamp -> string
string -> date
string -> datetime
string -> time
string -> timestamp
Doing this with different microsecond precission would be also good
And the same with alter table.
> === modified file 'mysql-test/r/cast.result'
> --- mysql-test/r/cast.result 2009-05-21 08:06:43 +0000
> +++ mysql-test/r/cast.result 2011-03-18 13:52:20 +0000
> @@ -254,7 +254,7 @@ cast("2001-1-1" as datetime) = "2001-01-
> 1
> select cast("1:2:3" as TIME) = "1:02:03";
> cast("1:2:3" as TIME) = "1:02:03"
> -0
> +1
> select cast(NULL as DATE);
> cast(NULL as DATE)
> NULL
> @@ -452,3 +452,9 @@ SELECT CONVERT(t2.a USING UTF8) FROM t1,
> 1
> DROP TABLE t1;
> End of 5.1 tests
> +create table t1 (f1 time, f2 date, f3 datetime);
> +insert into t1 values ('11:22:33','2011-12-13','2011-12-13 11:22:33');
> +select cast(f1 as unsigned), cast(f2 as unsigned), cast(f3 as unsigned) from t1;
> +cast(f1 as unsigned) cast(f2 as unsigned) cast(f3 as unsigned)
> +112233 20111213 20111213112233
> +drop table t1;
Please also add a field datetime with microseconds to test cast to
unsigned and double.
> === modified file 'mysql-test/r/date_formats.result'
> --- mysql-test/r/date_formats.result 2010-11-12 10:12:15 +0000
> +++ mysql-test/r/date_formats.result 2011-03-08 18:41:58 +0000
> @@ -196,16 +196,16 @@ date format datetime
> 0003-01-02 8:11:2.123456 %Y-%m-%d %H:%i:%S.%# 0003-01-02 08:11:02
> 03-01-02 8:11:2.123456 %Y-%m-%d %H:%i:%S.%# 2003-01-02 08:11:02
> 2003-01-02 10:11:12 PM %Y-%m-%d %h:%i:%S %p 2003-01-02 22:11:12
> -2003-01-02 01:11:12.12345AM %Y-%m-%d %h:%i:%S.%f%p 2003-01-02 01:11:12.123450
> -2003-01-02 02:11:12.12345AM %Y-%m-%d %h:%i:%S.%f %p 2003-01-02 02:11:12.123450
> -2003-01-02 12:11:12.12345 am %Y-%m-%d %h:%i:%S.%f%p 2003-01-02 00:11:12.123450
> +2003-01-02 01:11:12.12345AM %Y-%m-%d %h:%i:%S.%f%p 2003-01-02 01:11:12
> +2003-01-02 02:11:12.12345AM %Y-%m-%d %h:%i:%S.%f %p 2003-01-02 02:11:12
> +2003-01-02 12:11:12.12345 am %Y-%m-%d %h:%i:%S.%f%p 2003-01-02 00:11:12
Why has microseconds been removed from the output ?
It looks like 'cast(str_to_date(date, format) as datetime)' removes
all microseconds, which is an incompatible change. In the past
cast(..., datetime) as same as cast(..., datetime(6))
You should at least add a new test that uses 'cast(str_to_date(date, format) as datetime(6))'
Please also compile a list for the KB about all incompatible changes.
<cut>
I think that it's more important that cast() produces the same value
as MySQL did than that create ... select cast() produces the same
type as before.
With the first issue, you will get lost data (without any warnings)
for old applictions if they would do things like:
select * from t1 where cast(char_column as time) = "12:23:24.123456";
which works with MySQL but not with this patch.
> === modified file 'mysql-test/r/func_in.result'
> --- mysql-test/r/func_in.result 2010-06-22 18:53:08 +0000
> +++ mysql-test/r/func_in.result 2011-03-01 12:24:36 +0000
> @@ -544,15 +544,9 @@ id select_type table type possible_keys
> select f2 from t2 where f2 in ('a','b');
> f2
> 0
> -Warnings:
> -Warning 1292 Truncated incorrect DOUBLE value: 'a'
> -Warning 1292 Truncated incorrect DOUBLE value: 'b'
> explain select f2 from t2 where f2 in ('a','b');
> id select_type table type possible_keys key key_len ref rows Extra
> 1 SIMPLE t2 index t2f2 t2f2 5 NULL 3 Using where; Using index
> -Warnings:
> -Warning 1292 Truncated incorrect DOUBLE value: 'a'
> -Warning 1292 Truncated incorrect DOUBLE value: 'b'
Good that we got rid of confusing internal error messages!
> === modified file 'mysql-test/r/func_sapdb.result'
> --- mysql-test/r/func_sapdb.result 2010-08-13 13:05:46 +0000
> +++ mysql-test/r/func_sapdb.result 2011-03-08 18:41:58 +0000
> @@ -113,10 +113,10 @@ subtime("01:00:00.999999", "02:00:00.999
> -00:59:59.999999
> select subtime("02:01:01.999999", "01:01:01.999999");
> subtime("02:01:01.999999", "01:01:01.999999")
> -01:00:00.000000
> +01:00:00
Add also a test for:
select subtime("02:01:01.999999", "01:01:01.999998");
To ensure we don't drop the sub seconds.
> === modified file 'mysql-test/r/func_time.result'
> --- mysql-test/r/func_time.result 2010-10-31 16:04:38 +0000
> +++ mysql-test/r/func_time.result 2011-03-19 14:49:36 +0000
> @@ -8,20 +8,20 @@ period_add("9602",-12) period_diff(19950
<cut>
> +1994-03-02 10:11:12 1994-03-02 10:11:12 19940302101112
> select sec_to_time(9001),sec_to_time(9001)+0,time_to_sec("15:12:22"),
> sec_to_time(time_to_sec("0:30:47")/6.21);
> sec_to_time(9001) sec_to_time(9001)+0 time_to_sec("15:12:22") sec_to_time(time_to_sec("0:30:47")/6.21)
> -02:30:01 23001.000000 54742 00:04:57
> +02:30:01 23001 54742 00:04:57.423510
Please add test for 'sec_to_time(9001.1)' and
'time_to_sec("15:12:22.123456')' and 'time_to_sec(15.55)'.
I noticed that:
select time_to_sec(15.55023124); -> 15.55023100
select time_to_sec('15.55023124'); -> 15.550231
Wouldn't this be better to limit the first one also to 6 digits?
(Serg agreed to this on IRC)
Add test for above cases too
> @@ -798,9 +798,7 @@ TIMESTAMPDIFF(year,'2006-01-10 14:30:28'
> 2
> select date_add(time,INTERVAL 1 SECOND) from t1;
> date_add(time,INTERVAL 1 SECOND)
> -NULL
> -Warnings:
> -Warning 1264 Out of range value for column 'time' at row 1
> +06:07:09
Add a note to compatible matrix that 'date_add()' now also works for
time types.
> @@ -941,10 +939,10 @@ sec_to_time(1) + 0, from_unixtime(1) + 0
> show create table t1;
> Table Create Table
> t1 CREATE TABLE `t1` (
> - `now() - now()` double(23,6) NOT NULL DEFAULT '0.000000',
> - `curtime() - curtime()` double(23,6) NOT NULL DEFAULT '0.000000',
> - `sec_to_time(1) + 0` double(23,6) DEFAULT NULL,
> - `from_unixtime(1) + 0` double(23,6) DEFAULT NULL
> + `now() - now()` double(17,0) NOT NULL DEFAULT '0',
> + `curtime() - curtime()` double(17,0) NOT NULL DEFAULT '0',
> + `sec_to_time(1) + 0` double(17,0) DEFAULT NULL,
> + `from_unixtime(1) + 0` double(17,0) DEFAULT NULL
> ) ENGINE=MyISAM DEFAULT CHARSET=latin1
To make example complete, add to the original create time some
decimals for sec_to_time and also some decimals to one of the now() arguments.
The CREATE was:
create table t1 select now() - now(), curtime() - curtime(),
sec_to_time(1) + 0, from_unixtime(1) + 0;
As part of this, I noticed that from_unixtime() and unix_timestamp()
is not enhanced to support microseconds.
This needs to work:
create table t2 (a int, b timestamp(6))
insert into t2 (a) values(1);
select * from t2;
I got '1 | 2011-03-29 15:38:53.395516'
select unix_timestamp(b) from t2; -> 1301402333 (no decimals)
This needs to returns sub seconds!
I tested that:
set timestamp=1301402333.55;
select now(6);
works (good)!
<cut>
> +create table t1 (f1 integer, f2 date);
> +insert into t1 values (1,'2011-05-05'),(2,'2011-05-05'),(3,'2011-05-05'),(4,'2011-05-05'),(5,'2011-05-05');
> +select * from t1 where 1 and concat(f2)=MAKEDATE(2011, 125);
> +f1 f2
> +1 2011-05-05
> +2 2011-05-05
> +3 2011-05-05
> +4 2011-05-05
> +5 2011-05-05
Add also at least one value for which there is no match.
<cut>
> --- mysql-test/r/func_time_hires.result 1970-01-01 00:00:00 +0000
> +++ mysql-test/r/func_time_hires.result 2011-03-17 13:13:03 +0000
> @@ -0,0 +1,177 @@
> +set timestamp=unix_timestamp('2011-01-01 01:01:01') + 0.123456, time_zone='+03:00';
> +select sec_to_time(12345), sec_to_time(12345.6789), sec_to_time(1234567e-2);
> +sec_to_time(12345) 03:25:45
> +sec_to_time(12345.6789) 03:25:45.6789
> +sec_to_time(1234567e-2) 03:25:45.670000
> +select now(), curtime(0), utc_timestamp(1), utc_time(2), current_time(3),
> +current_timestamp(4), localtime(5), localtimestamp(6), time_to_sec('12:34:56'),
> +time_to_sec('12:34:56.789');
> +now() 2011-01-01 01:01:01
> +curtime(0) 01:01:01
> +utc_timestamp(1) 2010-12-31 22:01:01.1
> +utc_time(2) 22:01:01.12
> +current_time(3) 01:01:01.123
> +current_timestamp(4) 2011-01-01 01:01:01.1234
> +localtime(5) 2011-01-01 01:01:01.12345
> +localtimestamp(6) 2011-01-01 01:01:01.123456
> +time_to_sec('12:34:56') 45296
> +time_to_sec('12:34:56.789') 45296.789
> +select sec_to_time(time_to_sec('1:2:3')), sec_to_time(time_to_sec('2:3:4.567890'));
> +sec_to_time(time_to_sec('1:2:3')) 01:02:03
> +sec_to_time(time_to_sec('2:3:4.567890')) 02:03:04.567890
> +select time_to_sec(sec_to_time(11111)), time_to_sec(sec_to_time(11111.22222));
> +time_to_sec(sec_to_time(11111)) 11111
> +time_to_sec(sec_to_time(11111.22222)) 11111.22222
> +select current_timestamp(7);
> +ERROR HY000: Incorrect arguments to now
<cut>
> +select CAST(@a AS DATETIME(7));
> +ERROR 42000: Too big precision 7 specified for column '(@a)'. Maximum is 6.
The above would be a better error message than 'Incorrect arguments to
now'. Can you add precision and maximum to this error message?
> +select time(f1) from t1 union all select time(f1) from t1;
Add a time operation on the second time(f1) to make the rows a bit
difference (and test more code)
> +time(f1)
> +21:00:00.000000
> +21:00:00.000000
> +drop table t1;
> === modified file 'mysql-test/r/information_schema.result'
> --- mysql-test/r/information_schema.result 2010-06-23 16:25:31 +0000
> +++ mysql-test/r/information_schema.result 2011-03-01 12:24:36 +0000
> @@ -189,8 +189,8 @@ Field Type Collation Null Key Default Ex
> c varchar(64) utf8_general_ci NO select,insert,update,references
> select * from information_schema.COLUMNS where table_name="t1"
> and column_name= "a";
> -TABLE_CATALOG TABLE_SCHEMA TABLE_NAME COLUMN_NAME ORDINAL_POSITION COLUMN_DEFAULT IS_NULLABLE DATA_TYPE CHARACTER_MAXIMUM_LENGTH CHARACTER_OCTET_LENGTH NUMERIC_PRECISION NUMERIC_SCALE CHARACTER_SET_NAME COLLATION_NAME COLUMN_TYPE COLUMN_KEY EXTRA PRIVILEGES COLUMN_COMMENT
> -NULL mysqltest t1 a 1 NULL YES int NULL NULL 10 0 NULL NULL int(11) select,insert,update,references
> +TABLE_CATALOG TABLE_SCHEMA TABLE_NAME COLUMN_NAME ORDINAL_POSITION COLUMN_DEFAULT IS_NULLABLE DATA_TYPE CHARACTER_MAXIMUM_LENGTH CHARACTER_OCTET_LENGTH NUMERIC_PRECISION NUMERIC_SCALE DATETIME_PRECISION CHARACTER_SET_NAME COLLATION_NAME COLUMN_TYPE COLUMN_KEY EXTRA PRIVILEGES COLUMN_COMMENT
> +NULL mysqltest t1 a 1 NULL YES int NULL NULL 10 0 NULL NULL NULL int(11) select,insert,update,references
This is also an incompatible change that should be listed.
> === modified file 'mysql-test/r/parser.result'
> --- mysql-test/r/parser.result 2009-04-17 20:00:53 +0000
> +++ mysql-test/r/parser.result 2011-03-01 12:24:36 +0000
> @@ -556,9 +556,13 @@ DROP TABLE IF EXISTS t1;
> SELECT STR_TO_DATE('10:00 PM', '%h:%i %p') + INTERVAL 10 MINUTE;
> STR_TO_DATE('10:00 PM', '%h:%i %p') + INTERVAL 10 MINUTE
> NULL
> +Warnings:
> +Error 1411 Incorrect datetime value: '10:00 PM' for function str_to_date
Strange that 'str_to_date()' doesn't work with TIME arguments, but
'date_add()' does.
> === modified file 'mysql-test/r/ps.result'
> --- mysql-test/r/ps.result 2010-10-04 08:51:26 +0000
> +++ mysql-test/r/ps.result 2011-03-01 12:24:36 +0000
> @@ -3040,3 +3040,18 @@ id select_type table type possible_keys
> DEALLOCATE PREPARE stmt;
> DROP TABLE t1;
> End of 5.1 tests.
> +prepare stmt from "select date('2010-10-10') between '2010-09-09' and ?";
> +set @a='2010-11-11';
> +execute stmt using @a;
> +date('2010-10-10') between '2010-09-09' and ?
> +1
> +execute stmt using @a;
> +date('2010-10-10') between '2010-09-09' and ?
> +1
> +set @a='2010-08-08';
> +execute stmt using @a;
> +date('2010-10-10') between '2010-09-09' and ?
> +0
> +execute stmt using @a;
> +date('2010-10-10') between '2010-09-09' and ?
> +0
Please add a test that uses sub seconds (for completeness)
set @a='2010-08-08 02:03:04.567890';
execute stmt using @a;
> === modified file 'mysql-test/r/select.result'
<cut>
> -select str_to_date('2007-10-09','%Y-%m-%d') <= '2007/10/2000:00:00 GMT-6';
> -str_to_date('2007-10-09','%Y-%m-%d') <= '2007/10/2000:00:00 GMT-6'
> +select str_to_date('2007-10-09','%Y-%m-%d') <= '2007/10/20 00:00:00 GMT-6';
> +str_to_date('2007-10-09','%Y-%m-%d') <= '2007/10/20 00:00:00 GMT-6'
Please add a new test that shows that this fails:
select str_to_date('2007-10-09','%Y-%m-%d') <= '2007/10/2000:00:00 GMT-6'
As this is an incompatible change (but of course not critical)
> Warnings:
> -Warning 1292 Truncated incorrect date value: '2007/10/2000:00:00 GMT-6'
> +Warning 1292 Truncated incorrect date value: '2007/10/20 00:00:00 GMT-6'
> select str_to_date('2007-10-01','%Y-%m-%d') = '2007-10-1 00:00:00 GMT-6';
> str_to_date('2007-10-01','%Y-%m-%d') = '2007-10-1 00:00:00 GMT-6'
> 1
> @@ -4179,22 +4179,17 @@ Warning 1292 Truncated incorrect datetim
> select str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '2007-10-01 12:34';
> str_to_date('2007-10-00 12:34','%Y-%m-%d %H:%i') = '2007-10-01 12:34'
> 0
> -Warnings:
> -Warning 1292 Truncated incorrect datetime value: '2007-10-00 12:34:00'
Add a test here (which is run with SQL_MODE=traditional):
select str_to_date('2007-10-00', "%Y-%m-%d %H:%i");
(Makes the test easier to understand)
> @@ -4221,18 +4216,18 @@ str_to_date('1','%Y-%m-%d') = '1'
> Warning 1292 Truncated incorrect date value: '1'
> select str_to_date('','%Y-%m-%d') = '';
> str_to_date('','%Y-%m-%d') = ''
This is an incompatible change (but for the better)
In comparison string to date, we don't give errors for zero part dates
independent of SQL_MODE.
And as agreed on IRC, we can't have str_to_date('','%Y-%m-%d') return
NULL as otherwise we couldn't use = to find rows with zero date in a
table which would be worse.
<cut>
> Warnings:
> Warning 1292 Truncated incorrect date value: ''
> select str_to_date('1000-01-01','%Y-%m-%d') between '0000-00-00' and NULL;
> str_to_date('1000-01-01','%Y-%m-%d') between '0000-00-00' and NULL
> -0
> +NULL
> select str_to_date('1000-01-01','%Y-%m-%d') between NULL and '2000-00-00';
> str_to_date('1000-01-01','%Y-%m-%d') between NULL and '2000-00-00'
> -0
> +NULL
> select str_to_date('1000-01-01','%Y-%m-%d') between NULL and NULL;
> str_to_date('1000-01-01','%Y-%m-%d') between NULL and NULL
> -0
> +NULL
For completeness sake add:
select str_to_date('2000-01-02','%Y-%m-%d') between NULL and '2000-01-01';
-> 0
select str_to_date('2000-01-02','%Y-%m-%d') between '2000-01-03' and NULL;
-> 0
MariaDB 5.1 returns wrong result (NULL) for the above while 5.1-micro
returns the correct answer.
> === modified file 'mysql-test/r/subselect.result'
<cut>
> -delete from t1 where b = (select b from t1);
> +delete from t1 where b in (select b from t1);
Why the above change ?
Please also add the original row, so that we can ensure it works the
same way in the future.
> === modified file 'mysql-test/r/type_datetime.result'
> @@ -352,7 +352,7 @@ least(cast('01-01-01' as date), '01-01-0
> 2001-01-01
> select greatest(cast('01-01-01' as date), '01-01-02');
> greatest(cast('01-01-01' as date), '01-01-02')
> -01-01-02
> +2001-01-02
Incompatible change (but ok)
> === modified file 'mysql-test/r/type_time.result'
> --- mysql-test/r/type_time.result 2010-05-31 09:25:11 +0000
> +++ mysql-test/r/type_time.result 2011-03-08 21:01:40 +0000
> @@ -1,6 +1,8 @@
> drop table if exists t1;
> create table t1 (t time);
> insert into t1 values("10:22:33"),("12:34:56.78"),(10),(1234),(123456.78),(1234559.99),("1"),("1:23"),("1:23:45"), ("10.22"), ("-10 1:22:33.45"),("20 10:22:33"),("1999-02-03 20:33:34");
> +Warnings:
> +Note 1265 Data truncated for column 't' at row 13
Why the above note ?
(I checked that the data is ok)
Please document this in the compatible matrix.
> === modified file 'mysql-test/suite/funcs_1/r/is_columns.result'
> --- mysql-test/suite/funcs_1/r/is_columns.result 2008-11-13 09:50:20 +0000
> +++ mysql-test/suite/funcs_1/r/is_columns.result 2011-03-24 14:55:52 +0000
<cut>
> +++ mysql-test/suite/funcs_1/r/is_columns_innodb.result 2011-03-24 14:55:52 +0000
> @@ -382,333 +382,333 @@ LOAD DATA INFILE '<MYSQLTEST_VARDIR>/std
> SELECT * FROM information_schema.columns
> WHERE table_schema LIKE 'test%'
> ORDER BY table_schema, table_name, column_name;
> -TABLE_CATALOG TABLE_SCHEMA TABLE_NAME COLUMN_NAME ORDINAL_POSITION COLUMN_DEFAULT IS_NULLABLE DATA_TYPE CHARACTER_MAXIMUM_LENGTH CHARACTER_OCTET_LENGTH NUMERIC_PRECISION NUMERIC_SCALE CHARACTER_SET_NAME COLLATION_NAME COLUMN_TYPE COLUMN_KEY EXTRA PRIVILEGES COLUMN_COMMENT
> -NULL test t1 f1 1 NULL YES char 20 20 NULL NULL latin1 latin1_swedish_ci char(20) select,insert,update,references
.... lots of changes
How about changing the SELECT * FROM information_schema.columns to
just list the old columns in suite/funcs_1/datadict/columns.inc
This gives a smaller change list and will not cause a diff if we add
another column...
Especially this change caused really huge diff and will likely cause
merge conflicts sooner or later...
<cut>
> === modified file 'mysys/my_getsystime.c'
<cut>
> +/*
> + get time since epoc in 100 nanosec units
> +
> + NOTE:
> + Thus to get the current time we should use the system function
> + with the highest possible resolution
> +
> + The value is not subject to resetting or drifting by way of adjtime() or
> + settimeofday(), and thus it is *NOT* appropriate for getting the current
> + timestamp. It can be used for calculating time intervals, though.
> + And it's good enough for UUID.
> +*/
> +
> ulonglong my_getsystime()
> {
> #ifdef HAVE_CLOCK_GETTIME
> struct timespec tp;
> clock_gettime(CLOCK_REALTIME, &tp);
> return (ulonglong)tp.tv_sec*10000000+(ulonglong)tp.tv_nsec/100;
> +#elif defined(HAVE_GETHRTIME)
> + return gethrtime()/100-gethrtime_offset;
> #elif defined(__WIN__)
> LARGE_INTEGER t_cnt;
> if (query_performance_frequency)
> @@ -57,169 +73,70 @@ ulonglong my_getsystime()
> #endif
> }
As discussed on IRC, I don't think it's make sense to try to have this
aligned to 'epoch'. Better to just remove gethrtime_offset;
Beeing 'free' from epoch will allow us to use other even faster
clock's in the future like clock_gettime(CLOCK_MONOTONIC)
> +my_hrtime_t my_hrtime()
> {
> + my_hrtime_t hrtime;
> #if defined(__WIN__)
> ulonglong newtime;
> GetSystemTimeAsFileTime((FILETIME*)&newtime);
> + hrtime.val= newtime/10;
> #elif defined(HAVE_GETHRTIME)
Shouldn't the above be 'HAVE_GETTIMEOFDAY' ?
> struct timeval t;
> /*
> The following loop is here because gettimeofday may fail on some systems
> */
> while (gettimeofday(&t, NULL) != 0)
> {}
> + hrtime.val= t.tv_sec*1000000 + t.tv_usec;
> +#else
> + hrtime.val= my_getsystime()/10;
Don't think this option is correct as my_getsystime() is defined to be
not reliable for timing.
I would assume that all systems have either GetSystemTimeAsFileTime,
gettimeofday or clock_gettime().
So I would instead add here:
#ifdef HAVE_CLOCK_GETTIME
struct timespec tp;
clock_gettime(CLOCK_REALTIME, &tp);
return (ulonglong)tp.tv_sec*10000000+(ulonglong)tp.tv_nsec/100;
#else
#error Need one of GetSystemTimeAsFileTime(), gettimeofday() or clock_gettime()
#endif
> + return hrtime;
<cut>
> +void my_time_init()
> {
<cut>
> + gethrtime_offset= gethrtime();
I suggest we remove this
> +#endif
> }
> === modified file 'sql-common/my_time.c'
> --- sql-common/my_time.c 2010-07-09 12:00:17 +0000
> +++ sql-common/my_time.c 2011-03-26 10:59:34 +0000
> @@ -19,6 +19,9 @@
> /* Windows version of localtime_r() is declared in my_ptrhead.h */
> #include <my_pthread.h>
>
> +static enum enum_mysql_timestamp_type
> +str_to_time(const char *str, uint length, MYSQL_TIME *l_time, int *warning)
Make this again global; There are places where we can call this
directly and there is no reason to do an extra jump to get here.
<cut>
> @@ -681,24 +692,31 @@ my_bool str_to_time(const char *str, uin
> 1 time value is invalid
> */
>
> +int check_time_range(struct st_mysql_time *my_time, uint dec, int *warning)
> {
> longlong hour;
> + static ulong max_sec_part[MAX_SEC_PART_DIGITS+1]= {000000, 900000, 990000,
> + 999000, 999900, 999990, 999999};
>
> if (my_time->minute >= 60 || my_time->second >= 60)
> return 1;
>
> hour= my_time->hour + (24*my_time->day);
> +
> + if (dec == AUTO_SEC_PART_DIGITS)
> + dec= MAX_SEC_PART_DIGITS;
> +
> if (hour <= TIME_MAX_HOUR &&
> (hour != TIME_MAX_HOUR || my_time->minute != TIME_MAX_MINUTE ||
> + my_time->second != TIME_MAX_SECOND ||
> + my_time->second_part <= max_sec_part[dec]))
> return 0;
>
> my_time->day= 0;
> my_time->hour= TIME_MAX_HOUR;
> my_time->minute= TIME_MAX_MINUTE;
> my_time->second= TIME_MAX_SECOND;
> - my_time->second_part= 0;
> + my_time->second_part= TIME_MAX_SECOND_PART;
Shouldn't this be max_sec_part[dec] ?
(Otherwise you may return a bigger value than what the function was
called with)
> *warning|= MYSQL_TIME_WARN_OUT_OF_RANGE;
> return 0;
> }
<cut>
> +int number_to_time(double nr, MYSQL_TIME *ltime, int *was_cut)
> +{
> + ulong tmp;
> + *was_cut= 0;
> + ltime->year= ltime->month= ltime->day= 0;
> + ltime->time_type= MYSQL_TIMESTAMP_TIME;
> +
> + if ((ltime->neg= nr < 0))
> + nr= -nr;
> +
> + if (nr > TIME_MAX_VALUE)
> + {
> + nr= TIME_MAX_VALUE;
> + *was_cut= MYSQL_TIME_WARN_OUT_OF_RANGE;
> + }
> + tmp=(ulong)floor(nr);
> + ltime->hour = tmp/100/100;
> + ltime->minute= tmp/100%100;
> + ltime->second= tmp%100;
> + ltime->second_part= (ulong)((nr-tmp)*1e6);
Change 1e6 to a define like 'TIME_SECOND_PART_TO_INT_MULTIPLIER' here
and in all other places.
> + if (ltime->minute < 60 && ltime->second < 60)
> + return 0;
> +
> + *was_cut= MYSQL_TIME_WARN_TRUNCATED;
> + return 1;
> +}
<cut>
> === modified file 'sql/event_db_repository.cc'
> --- sql/event_db_repository.cc 2010-01-22 09:38:21 +0000
> +++ sql/event_db_repository.cc 2011-03-01 12:24:36 +0000
> @@ -231,6 +231,9 @@ mysql_event_fill_row(THD *thd,
> rs|= fields[ET_FIELD_STATUS]->store((longlong)et->status, TRUE);
> rs|= fields[ET_FIELD_ORIGINATOR]->store((longlong)et->originator, TRUE);
>
> + if (!is_update)
> + rs|= fields[ET_FIELD_CREATED]->set_time();
> +
Don't understand why this was moved from Event_db_repository::create_event().
Probably safe but it would be nice to know the reason.
(The bzr file comment didn't say anything about this)
> === modified file 'sql/event_queue.cc'
> --- sql/event_queue.cc 2008-05-09 07:43:02 +0000
> +++ sql/event_queue.cc 2011-03-01 12:24:36 +0000
> @@ -513,9 +513,10 @@ Event_queue::empty_queue()
> */
>
> void
> +Event_queue::dbug_dump_queue(my_time_t when)
> {
> #ifndef DBUG_OFF
> + my_time_t now= when;
> Event_queue_element *et;
> uint i;
> DBUG_ENTER("Event_queue::dbug_dump_queue");
> @@ -592,14 +593,13 @@ Event_queue::get_top_for_execution_if_ti
> thd->set_current_time(); /* Get current time */
>
> next_activation_at= top->execute_at;
> + if (next_activation_at >thd->query_start())
> {
> /*
> Not yet time for top event, wait on condition with
> time or until signaled. Release LOCK_queue while waiting.
> */
> - struct timespec top_time;
> - set_timespec(top_time, next_activation_at - thd->query_start());
> + struct timespec top_time= { next_activation_at, 0 };
Not sure if the above change is portable. That is why we have the
set_timespec() defines.
Any particular reason why you did the change ?
I would prefer to have the original code as we know it's works and
compiles on a lot of platforms.
However, after your definition change of my_getsystime() you have to
redefine all the macros on my_pthread.h that calls my_getsystime() to
call my_hrtime() instead as my_getsystime() can't anymore be used as
a start point for timespec.
If you want to keep the above change, you should also remove all the
set_timespec() defines from my_pthread.h and fix all other code that
uses this!
> cond_wait(thd, &top_time, queue_wait_msg, SCHED_FUNC, __LINE__);
>
> continue;
> === modified file 'sql/field.cc'
> @@ -3151,13 +3141,9 @@ int Field_short::store(const char *from,
>
> error= get_int(cs, from, len, &rnd, UINT_MAX16, INT_MIN16, INT_MAX16);
> store_tmp= unsigned_flag ? (int) (ulonglong) rnd : (int) rnd;
> -#ifdef WORDS_BIGENDIAN
> - if (table->s->db_low_byte_first)
> - {
> + if (ARCH_BIGENDIAN && table->s->db_low_byte_first)
> int2store(ptr, store_tmp);
> - }
The above code can produce warnings on some compilers with the error
'if is always true'. I also find it much harder to read and understand!
Please add back the original code and don't try to play smart games
with the compiler!
This is not part of microsecond patch and should not be done here.
Same goes for all other changes of WORDS_BIGENDIAN.
That said, we can probably remove the db_low_byte_first code in the
future as no engine uses this anymore. This should however be a
different patch!
> @@ -4822,136 +4682,136 @@ timestamp_auto_set_type Field_timestamp:
> }
> }
>
> +long Field_timestamp::get_timestamp(ulong *sec_part) const
> +{
> + ASSERT_COLUMN_MARKED_FOR_READ;
> + *sec_part= 0;
> + if (ARCH_BIGENDIAN && table && table->s->db_low_byte_first)
> + return sint4korr(ptr);
> + long tmp;
> + longget(tmp,ptr);
> + return tmp;
> +}
>
Shouldn't this be of type 'my_time_t' for completeness?
> -int Field_timestamp::store(const char *from,uint len,CHARSET_INFO *cs)
> +int Field_timestamp::store_TIME_with_warning(THD *thd, MYSQL_TIME *l_time,
> + const Lazy_string *str,
> + bool was_cut, bool have_smth_to_conv)
> {
> ASSERT_COLUMN_MARKED_FOR_WRITE;
> + uint error = 0;
> + my_time_t timestamp;
> my_bool in_dst_time_gap;
> + if (was_cut || !have_smth_to_conv)
> {
> error= 1;
> set_datetime_warning(MYSQL_ERROR::WARN_LEVEL_WARN, WARN_DATA_TRUNCATED,
> + str, MYSQL_TIMESTAMP_DATETIME, 1);
> }
Add a comment here. The following test is not clear
> + if (have_smth_to_conv && l_time->month)
> {
> + if (!(timestamp= TIME_to_timestamp(thd, l_time, &in_dst_time_gap)))
> {
> set_datetime_warning(MYSQL_ERROR::WARN_LEVEL_WARN,
> ER_WARN_DATA_OUT_OF_RANGE,
> + str, MYSQL_TIMESTAMP_DATETIME, !error);
> error= 1;
> }
> else if (in_dst_time_gap)
> {
> set_datetime_warning(MYSQL_ERROR::WARN_LEVEL_WARN,
> ER_WARN_INVALID_TIMESTAMP,
> + str, MYSQL_TIMESTAMP_DATETIME, !error);
> error= 1;
> }
> }
As we are fixing things, it would be nice to change the two above 'if'
in the common case to one if. We could do it by changing the last
parameter in TIME_to_timestamp() to an enum and do:
timestamp= TIME_to_timestamp(thd, l_time, &timstamp_error)
if (timestamp_error)
{
set_datetime_warning(MYSQL_ERROR::WARN_LEVEL_WARN,
(timestamp_error ==
MYSQL_TIMESTAMP_IN_DST_TIME_GAP ?
ER_WARN_INVALID_TIMESTAMP :
ER_WARN_DATA_OUT_OF_RANGE),
str, MYSQL_TIMESTAMP_DATETIME, !error);
error= 1;
}
Less code and only one if...
> + else
> + {
> + timestamp= 0;
> + l_time->second_part= 0;
> + }
> + store_TIME(timestamp, l_time->second_part);
> return error;
> }
> +int Field_timestamp::store_time(MYSQL_TIME *ltime,timestamp_type time_type)
> +{
> + THD *thd= table ? table->in_use : current_thd;
You can assume that table->in_use is always set.
(Other place in the code already assumes that).
Feel free to fix this everywhere in field.cc
> + int unused;
> + MYSQL_TIME l_time= *ltime;
> + Lazy_string_time str(ltime);
> + bool valid= !check_date(&l_time, pack_time(&l_time) != 0,
> + (thd->variables.sql_mode & MODE_NO_ZERO_DATE) |
> + MODE_NO_ZERO_IN_DATE, &unused);
> +
> + return store_TIME_with_warning(thd, &l_time, &str, false, valid);
> +}
> +
Add an empty line here
> +int Field_timestamp::store(const char *from,uint len,CHARSET_INFO *cs)
> +{
<cut>
> int Field_timestamp::store(double nr)
> + MYSQL_TIME l_time;
> + int error;
> + Lazy_string_dbl str(nr);
> + THD *thd= table ? table->in_use : current_thd;
> +
> + /* We don't want to store invalid or fuzzy datetime values in TIMESTAMP */
> + longlong tmp= number_to_datetime((longlong) floor(nr),
> + &l_time, (thd->variables.sql_mode &
> + MODE_NO_ZERO_DATE) |
> + MODE_NO_ZERO_IN_DATE, &error);
> + l_time.second_part= (ulong)((nr-floor(nr))*1e6);
> + return store_TIME_with_warning(thd, &l_time, &str, error, tmp != LL(-1));
> }
You should add back the test:
if (nr < 0 || nr > 99991231235959.9999999)
the reason is that on some machines you get a trap or wrong result if
you try to convert a too big double to a longlong!
> longlong Field_timestamp::val_int(void)
> {
> MYSQL_TIME time_tmp;
> THD *thd= table ? table->in_use : current_thd;
>
> thd->time_zone_used= 1;
> ulong sec_part;
> uint32 temp= get_timestamp(&sec_part);
Add a comment here:
/*
Field_timestamp() and Field_timestamp_hres() shares this code.
This is why are also testing sec_part below.
*/
> if (temp == 0 && sec_part == 0)
> return(0);
<cut>
> longlong Field_timestamp::val_int(void)
> {
> MYSQL_TIME time_tmp;
> THD *thd= table ? table->in_use : current_thd;
Please add:
ASSERT_COLUMN_MARKED_FOR_READ;
> +static const char *zero_timestamp="0000-00-00 00:00:00.000000";
Move to start of file.
> String *Field_timestamp::val_str(String *val_buffer, String *val_ptr)
> {
> - ASSERT_COLUMN_MARKED_FOR_READ;
Add back ASSERT_COLUMN_MARKED_FOR_READ;
<cut>
> -void Field_timestamp::set_time()
> +int Field_timestamp::set_time()
> {
> THD *thd= table ? table->in_use : current_thd;
> - long tmp= (long) thd->query_start();
> set_notnull();
> - store_timestamp(tmp);
> + store_TIME(thd->query_start(), 0);
> + return 0;
> }
The set_time() function could be a void as it can never fail.
(Yes, I agree that it means that we have to test type of field in
event code, but that something we should do at event startup anyway...)
<cut>
> +static void store_native(ulonglong num, uchar *to, uint bytes)
> {
> + switch(bytes) {
> + case 1: *to= (uchar)num; break;
> + case 2: shortstore(to, (ushort)num); break;
> + case 3: int3store(to, num); /* Sic!*/ break;
> + case 4: longstore(to, (ulong)num); break;
> + case 8: longlongstore(to, num); break;
> + default: DBUG_ASSERT(0);
> + }
> +}
What would be even better if each field would have a pointers to c
functions with the right read and write functions; Would avoid a
switch for every call...
<cut>
> +static uint sec_part_bytes[MAX_DATETIME_PRECISION+1]= { 0, 1, 1, 2, 2, 3, 3 };
Move to start of file and document this.
<cut>
> + if (!tmp)
> + return fuzzydate & TIME_NO_ZERO_DATE;
should be:
> + return (fuzzydate & TIME_NO_ZERO_DATE) != 0;
as the function is a bool function
> +bool Field_datetime_hires::get_date(MYSQL_TIME *ltime, uint fuzzydate)
> +{
> + ulonglong packed= read_bigendian(ptr, Field_datetime_hires::pack_length());
> + unpack_time(sec_part_unshift(packed, dec), ltime);
> + if (!packed)
> + return fuzzydate & TIME_NO_ZERO_DATE;
Should be:
return (fuzzydate & TIME_NO_ZERO_DATE) != 0;
> + if (!ltime->month || !ltime->day)
> + return !(fuzzydate & TIME_FUZZY_DATE);
> + return 0;
> +}
> +
> +uint32 Field_datetime_hires::pack_length() const
> +{
> + return 5 + sec_part_bytes[dec];
> +}
move to include file.
<cut>
> @@ -8217,7 +8020,7 @@ Field_blob::unpack_key(uchar *to, const
> length+= *from++ << 8;
>
> /* put the length into the record buffer */
> - put_length(to, length);
> + store_length(to, packlength, length, table->s->db_low_byte_first);
Note that the old format was not depending on table->s->db_low_byte_first !
it should alway stored in low byte first order!
Change above to store_lowendian()
> === modified file 'sql/field.h'
> --- sql/field.h 2010-03-17 18:15:41 +0000
> +++ sql/field.h 2011-03-24 11:30:03 +0000
> @@ -22,8 +22,13 @@
> #pragma interface /* gcc class implementation */
> #endif
>
> +#ifdef WORDS_ARCH_BIGENDIAN
> +#define ARCH_BIGENDIAN 1
> +#else
> +#define ARCH_BIGENDIAN 0
> +#endif
> +
<cut>
> class Field_datetime :public Field_temporal {
...
> enum Item_result cmp_type () const { return TIME_RESULT; }
remove cmp_type() as it's already defined identically by Field_temporal
> === modified file 'sql/item.cc'
> --- sql/item.cc 2010-11-18 13:11:18 +0000
> +++ sql/item.cc 2011-03-24 14:55:52 +0000
> @@ -461,6 +461,34 @@ void Item::print_item_w_name(String *str
> }
>
>
> +void Item::print_value(String *str)
> +{
> + char buff[MAX_FIELD_WIDTH];
> + String *ptr, tmp(buff,sizeof(buff),str->charset());
> + ptr= val_str(&tmp);
> + if (!ptr)
> + str->append("NULL");
> + else
> + {
> + switch (result_type())
> + {
> + default:
> + DBUG_ASSERT(0);
> + case STRING_RESULT:
> + str->append('\'');
> + str->append(*ptr);
> + str->append('\'');
Shouldn't we also escape \' ?
(As we never know what the print is used for)
Doing 'append_unescaped(str, ptr->ptr(), ptr->length())' instead of
above would fix that.
(Yes, I know that the original code didn't do that, but that was also
a bug).
> + break;
> + case DECIMAL_RESULT:
> + case REAL_RESULT:
> + case INT_RESULT:
> + str->append(*ptr);
> + break;
> + }
Shouldn't you have TIME_RESULT: here too ?
If this never happens, you should at least have this as part of
default to make it clear it wasn't forgotten.
(Actually removing the default would be a good idea as then we will
get a compiler warning if someone adds a new result type but don't
take care if it here).
> + }
> +}
> +
> +
> void Item::cleanup()
> {
> DBUG_ENTER("Item::cleanup");
> @@ -503,6 +531,45 @@ void Item::rename(char *new_name)
> name= new_name;
> }
>
> +Item_result Item::cmp_type() const
> +{
> + switch(field_type()) {
Add space after switch
> bool Item::get_time(MYSQL_TIME *ltime)
> {
> return get_date(ltime, TIME_TIME_ONLY);
> }
You should probably add TIME_FUZZY_DATE to the above. This should fix
the error that I have described at the end of the review.
<cut>
> /**
> Traverse item tree possibly transforming it (replacing items).
> @@ -915,13 +982,15 @@ bool Item_string::eq(const Item *item, b
>
> bool Item::get_date(MYSQL_TIME *ltime,uint fuzzydate)
> {
> - if (result_type() == STRING_RESULT)
> + if (field_type() == MYSQL_TYPE_TIME)
> + fuzzydate|= TIME_TIME_ONLY;
> + if (result_type() == STRING_RESULT || fuzzydate & TIME_TIME_ONLY)
> {
Please add a comment under what conditions or why field_type() can be
MYSQL_TYPE_TIME and result_type() is STRING_RESULT.
(Looks like this only happens for Item_field(Field_time))
I am actually a bit confused by the fact that only Item_field can have
TIME_RESULT. Why doesn't all time functions say they have TIME_RESULT?
> @@ -2698,22 +2750,23 @@ void Item_param::set_time(MYSQL_TIME *tm
> value.time= *tm;
> value.time.time_type= time_type;
>
> + decimals= value.time.second_part > 0 ? MAX_SEC_PART_DIGITS : 0;
> +
> if (value.time.year > 9999 || value.time.month > 12 ||
> value.time.day > 31 ||
> (time_type != MYSQL_TIMESTAMP_TIME && value.time.hour > 23) ||
> - value.time.minute > 59 || value.time.second > 59)
> + value.time.minute > 59 || value.time.second > 59 ||
> + value.time.second_part > MAX_SEC_PART_VALUE)
> {
> + Lazy_string_time str(&value.time);
> make_truncated_value_warning(current_thd, MYSQL_ERROR::WARN_LEVEL_WARN,
> + &str, time_type, 0);
> set_zero_time(&value.time, MYSQL_TIMESTAMP_ERROR);
> }
>
> state= TIME_VALUE;
> maybe_null= 0;
> max_length= max_length_arg;
> - decimals= 0;
> DBUG_VOID_RETURN;
Please move setting of decimal to end of function (where the old
setting was).
all Item_param::set... functions first checks the arguments and then
sets all values at the end (in the same place). (Just makes the
function easier to read)
> }
>
> @@ -2790,10 +2843,12 @@ bool Item_param::set_from_user_var(THD *
> case REAL_RESULT:
> set_double(*(double*)entry->value);
> item_type= Item::REAL_ITEM;
> + param_type= MYSQL_TYPE_DOUBLE;
> break;
Here you set the param_type, but I wonder if there could be a problem
that the param_type is different over two execution of a prepared
statement.
I tried some things to see if there is a problem, but couldn't see any
behavior changes because of this, so this change may be safe.
What did this change fix?
(At at least a code comment why param_type is reset here)
> @@ -4651,12 +4693,7 @@ Item *Item_field::equal_fields_propagato
> item= this;
> else if (field && (field->flags & ZEROFILL_FLAG) && IS_NUM(field->type()))
> {
> - if (item && field->type() != FIELD_TYPE_TIMESTAMP && cmp_context != INT_RESULT)
> + if (item && (cmp_context == STRING_RESULT || cmp_context == (Item_result)-1))
When adding to 5.3, don't forget to change to use IMPOSSIBLE_RESULT
> @@ -6852,11 +6879,19 @@ void resolve_const_item(THD *thd, Item *
<cut>
> + case TIME_RESULT:
> + {
> + bool is_null;
> + Item **ref_copy= ref;
Please add a short comment why we don't create an Item_datetime value
here (as this is what we do in all the other cases)
> + get_datetime_value(thd, &ref_copy, &new_item, comp_item, &is_null);
> + if (is_null)
> + new_item= new Item_null(name);
> + break;
> + }
> @@ -7009,6 +7047,21 @@ int stored_field_cmp_to_item(THD *thd, F
> field_val= field->val_decimal(&field_buf);
> return my_decimal_cmp(item_val, field_val);
> }
Add comment:
/*
We have to check field->cmp_type() here as res_type is TIME_RESULT
if either field or item is of type TIME_RESULT
*/
> + if (field->cmp_type() == TIME_RESULT)
> + {
> + MYSQL_TIME field_time, item_time;
Fix indentation (should be 2 space, not 4)
> + if (field->type() == MYSQL_TYPE_TIME)
> + {
> + field->get_time(&field_time);
> + item->get_time(&item_time);
> + }
> + else
> + {
> + field->get_date(&field_time, TIME_FUZZY_DATE | TIME_INVALID_DATES);
> + item->get_date(&item_time, TIME_FUZZY_DATE | TIME_INVALID_DATES);
> + }
> + return my_time_compare(&field_time, &item_time);
> + }
Looking at the code, it would probably be simpler if we replaced all
the 'if (res_type)' with a swtich.
> @@ -7048,6 +7101,8 @@ Item_cache* Item_cache::get_cache(const
> return new Item_cache_str(item);
> case ROW_RESULT:
> return new Item_cache_row();
> + case TIME_RESULT:
> + return new Item_cache_int(MYSQL_TYPE_DATETIME);
Shouldn't this be a double or decimal as the time can contain microseconds?
Looks like a potential bug!
> === modified file 'sql/item.h'
> --- sql/item.h 2010-07-30 13:35:06 +0000
> +++ sql/item.h 2011-03-24 11:30:03 +0000
> @@ -569,7 +569,8 @@ class Item {
> virtual bool send(Protocol *protocol, String *str);
> virtual bool eq(const Item *, bool binary_cmp) const;
> virtual Item_result result_type() const { return REAL_RESULT; }
> - virtual Item_result cast_to_int_type() const { return result_type(); }
> + virtual Item_result cmp_type() const;
Please add documentation what's the difference between cmp_type() and
result_type() and when one should choose which.
By the way, you can remove Field->cast_to_int_type() as this is
always same as field->result_type().
I also noticed that you have a Field->cmp_type() but there is no
Item_field->cmp_type() mapped to Field->cmp_type(). I couldn't
As I assume that for all fields the following should hold:
Item_field->cmp_type() == Item_field->field->cmp_type()
If not that needs to be changed or properly documented as otherwise
things are very confusing!
> @@ -3031,7 +3022,7 @@ class Item_cache_int: public Item_cache
> protected:
> longlong value;
> public:
> - Item_cache_int(): Item_cache(),
> + Item_cache_int(): Item_cache(MYSQL_TYPE_LONGLONG),
> value(0) {}
> Item_cache_int(enum_field_types field_type_arg):
> Item_cache(field_type_arg), value(0) {}
> @@ -3043,8 +3034,13 @@ class Item_cache_int: public Item_cache
> String* val_str(String *str);
> my_decimal *val_decimal(my_decimal *);
> enum Item_result result_type() const { return INT_RESULT; }
> - bool result_as_longlong() { return TRUE; }
> bool cache_value();
> + Item *clone_item()
> + {
> + Item_cache_int *item= new Item_cache_int(cached_field_type);
> + item->store(this, value);
> + return item;
> + }
> };
Don't understand why Item_cache_int() needs a clone_item when
Item_cache_real() or Item_cache_decimal() dosn't have one.
If this was needed, please check if the other Item_cache_xxx()
functions needs it.
> === modified file 'sql/item_cmpfunc.cc'
<cut>
Please add documentation for this function:
> @@ -891,65 +713,15 @@ int Arg_comparator::set_cmp_func(Item_re
> Item **a1, Item **a2,
> Item_result type)
<cut>
> @@ -1026,9 +761,14 @@ Item** Arg_comparator::cache_converted_c
> Item **cache_item,
> Item_result type)
> {
> - /* Don't need cache if doing context analysis only. */
> + /*
> + Don't need cache if doing context analysis only.
> + Also, get_datetime_value creates Item_cache internally.
> + Unless fixed, we should not do it here.
> + */
> if (!thd_arg->is_context_analysis_only() &&
> - (*value)->const_item() && type != (*value)->result_type())
> + (*value)->const_item() && type != (*value)->result_type() &&
> + type != TIME_RESULT)
> {
> Item_cache *cache= Item_cache::get_cache(*value, type);
> cache->setup(*value);
Why don't we cache it there instead of doing it in get_datetime_value()?
This would save us one if and 3 test for every call to
get_datetime_value().
You could here call get_datetime_value() and use this value in the
cache. If this doesn't easily work (as indicated on IRC), at least
expand the comment to explain the issue so that we can later try to
fix it in a better way.
> @@ -1071,81 +806,120 @@ void Arg_comparator::set_datetime_cmp_fu
> DESCRIPTION
> Retrieves the correct DATETIME value from given item for comparison by the
> compare_datetime() function.
> +
> + If the value should be compared as time (TIME_RESULT), it's retrieved as
> + MYSQL_TIME. Otherwise it's read as a number/string and converted to time.
> + Constant items are cached, so the convertion is only done once for them.
> +
> + Note the f_type behavior: if the item can be compared as time, then
> + f_type is this item's field_type(). Otherwise it's field_type() of
> + warn_item (which is the other operand of the comparison operator).
> + This logic provides correct string/number to date/time conversion
> + depending on the other operand (when comparing a string with a date, it's
> + parsed as a date, when comparing a string with a time it's parsed as a time)
>
> RETURN
> + MYSQL_TIME value, packed in a longlong, suitable for comparison.
> */
>
Please add the following to the comment:
For example, if we are executing
WHERE datetime_6_column = 20010101000001.000020
for the first call the 'number-item' will have item->cmp_type ==
REAL_RESULT. It will create an item_cache with cmp_type() ==
TIME_RESULT and item->result_type() == INT_RESULT which will replace
the item.
No other item than an item_cache item will have cmp_type() ==
TIME_RESULT and result_type() == INT_RESULT, which makes the
TIME_RESULT case safe.
> longlong
> get_datetime_value(THD *thd, Item ***item_arg, Item **cache_arg,
> Item *warn_item, bool *is_null)
> {
> + longlong UNINIT_VAR(value);
> Item *item= **item_arg;
> + enum_field_types f_type= warn_item->field_type();
> + timestamp_type t_type=
> + f_type == MYSQL_TYPE_DATE ? MYSQL_TIMESTAMP_DATE :
> + f_type == MYSQL_TYPE_TIME ? MYSQL_TIMESTAMP_TIME :
> + MYSQL_TIMESTAMP_DATETIME;
> +
> + switch (item->cmp_type()) {
> + case TIME_RESULT:
> + /* if it's our Item_cache_int, as created below, we simply use the value */
> + if (item->result_type() == INT_RESULT)
> + value= item->val_int();
Please also set cache_item= 0 to make the end function test simpler.
<cut>
> + default: DBUG_ASSERT(0);
> }
> - if (*is_null)
> + if ((*is_null= item->null_value))
> return ~(ulonglong) 0;
> + if (cache_arg && item->const_item() && item->type() != Item::CACHE_ITEM)
> {
With my suggested change above, you can remove test of CACHE_ITEM.
Another option would be to add to function start:
if (item->type() == Item::CACHE_ITEM)
{
value= item->val_int();
*is_null= item->null_value;
return;
}
Which would save us the calcuation of f_type and t_type for all cached
objects.
I would also suggest delaying calcuation of t_type until it's needed
as in most cases we will not need it (constant items or fields)
<cut>
> +int Arg_comparator::compare_e_datetime()
> +{
> + bool a_is_null, b_is_null;
> + longlong a_value, b_value;
> +
> + if (set_null)
> + owner->null_value= 0;
Remove the above set; No other compare_e... function sets null_value
> int Arg_comparator::compare_string()
> {
> @@ -2247,9 +1963,7 @@ void Item_func_between::fix_length_and_d
> {
> max_length= 1;
> int i;
> - bool datetime_found= FALSE;
> - int time_items_found= 0;
> - compare_as_dates= TRUE;
> + compare_as_dates= 0;
> THD *thd= current_thd;
Please move declarations first and assignment of variables later.
(Even better, move 'int i' to the block where it's used)
> /*
> @@ -2265,43 +1979,33 @@ void Item_func_between::fix_length_and_d
> return;
>
> /*
> + When comparing as date/time, we need to convert non-temporal values
> + (e.g. strings) to MYSQL_TIME. get_datetime_value() doees it
doees -> does
> + automatically when one of the operands is a date/time. But here we
> + may need to compare two strings as dates (str1 BETWEEN str2 AND date).
> + For this to work, we need to know what date/time type we compare
> + strings as.
> */
> + if (cmp_type == TIME_RESULT)
> {
> for (i= 0; i < 3; i++)
> {
> + if (args[i]->cmp_type() == TIME_RESULT)
> {
> - datetime_found= TRUE;
> + if (args[i]->field_type() != MYSQL_TYPE_TIME ||
> + (args[i]->field_type() == MYSQL_TYPE_TIME && compare_as_dates==0))
> + compare_as_dates= args[i];
> continue;
> }
> }
> }
I find it strange that you set compare_as_dates to the last not
MYSQL_TYPE_TIME column as this is used to give warnings as the
warning may in theory come from some other field.
I tried to make a test case to show the problem but didn't succeed :(
> @@ -2319,27 +2023,46 @@ void Item_func_between::fix_length_and_d
> longlong Item_func_between::val_int()
> { // ANSI BETWEEN
> DBUG_ASSERT(fixed == 1);
> - if (compare_as_dates)
> +
> + switch (cmp_type) {
> + case TIME_RESULT:
> {
> - int ge_res, le_res;
> + THD *thd= current_thd;
> + longlong value, a, b;
> + Item *cache, **ptr;
> + bool value_is_null, a_is_null, b_is_null;
> +
Add a comment here that it's ok to have cache and ptr on stack as if
we created a newcache item, it will be taken care of by the call to
change_item_tree().
<cut>
> + if (!a_is_null && !b_is_null)
> + return (longlong) ((value >= a && value <= b) != negated);
> + if (a_is_null && b_is_null)
> + null_value=1;
> + else if (a_is_null)
> + null_value= value <= b; // not null if false range.
> else
> + null_value= value >= a;
> + break;
A slightly simpler if:
if (a_is_null)
null_value= b_is_null || value <= b; // not null if false range.
else
null_value= value >= a;
<cut>
> @@ -3193,6 +2940,13 @@ void Item_func_coalesce::fix_length_and_
> {
> cached_field_type= agg_field_type(args, arg_count);
> agg_result_type(&hybrid_type, args, arg_count);
> + Item_result cmp_type;
> + agg_cmp_type(&cmp_type, args, arg_count);
> + if (cmp_type == TIME_RESULT)
> + {
> + count_real_length();
> + return;
> + }
Add a comment that this code is not needed when we have TIME_RESULT
also as a result type.
> === modified file 'sql/item_create.cc'
> --- sql/item_create.cc 2010-07-02 18:30:47 +0000
> +++ sql/item_create.cc 2011-03-08 18:41:58 +0000
> @@ -5044,6 +5044,14 @@ find_qualified_function_builder(THD *thd
> return & Create_sp_func::s_singleton;
> }
>
> +static inline const char* item_name(Item *a, String *str)
> +{
> + if (a->name)
> + return a->name;
> + str->length(0);
> + a->print(str, QT_ORDINARY);
> + return str->c_ptr_safe();
> +}
Add a comment that 'item_name()' always returns a \0 ended string.
> @@ -5051,6 +5059,8 @@ create_func_cast(THD *thd, Item *a, Cast
> CHARSET_INFO *cs)
<cut>
> case ITEM_CAST_DATETIME:
> - res= new (thd->mem_root) Item_datetime_typecast(a);
> + {
> + uint len;
> + if (c_len)
> + {
> + errno= 0;
> + len= strtoul(c_len, NULL, 10);
Note that strtoul() on error doesn't set errno but my_errno (if we are using
using our version of strtoul()).
Anyway you don't have to check errno as it will return UTYPE_MAX on
error, which is bigger than MAX_DATETIME_PRECISION.
In general, it's better to use my_strtoll10() than strtoul() because
this is faster and it returns the error in a variable that is easier
to check than errno.
Please also fix all other calls to strtoul() in this function
> + if (errno != 0 || len > MAX_DATETIME_PRECISION)
> + {
> + my_error(ER_TOO_BIG_PRECISION, MYF(0), len,
> + item_name(a, &buf), MAX_DATETIME_PRECISION);
> + return NULL;
> + }
> + }
> + else
> + len= 0;
Set len= 0 before the 'if'. This is faster and less code lines.
> case ITEM_CAST_DECIMAL:
> {
> ulong len= 0;
> @@ -5083,8 +5111,8 @@ create_func_cast(THD *thd, Item *a, Cast
> decoded_size= strtoul(c_len, NULL, 10);
> if (errno != 0)
> {
> - my_error(ER_TOO_BIG_PRECISION, MYF(0), c_len, a->name,
> - DECIMAL_MAX_PRECISION);
> + my_error(ER_TOO_BIG_PRECISION, MYF(0), decoded_size,
> + item_name(a, &buf), DECIMAL_MAX_PRECISION);
Good bug fix, but still not perfect as the value could be a longlong and
then you get a strange number in the result. Best would be to print
the result as ulonglong, but that would require us to add another
error message which is not optimal either.
Example:
select cast('2001-01-01' as decimal(99999999999999));
->
ERROR 1426 (42000): Too big precision 276447231 specified for column '2001-01-01'. Maximum is 65
<cut>
> @@ -5135,7 +5163,7 @@ create_func_cast(THD *thd, Item *a, Cast
> decoded_size= strtoul(c_len, NULL, 10);
> if ((errno != 0) || (decoded_size > MAX_FIELD_BLOBLENGTH))
> {
> - my_error(ER_TOO_BIG_DISPLAYWIDTH, MYF(0), "cast as char", MAX_FIELD_BLOBLENGTH);
> + my_error(ER_TOO_BIG_DISPLAYWIDTH, MYF(0), "cast as char", MAX_FIELD_BLOBLENGTH);
> return NULL;
> }
> len= (int) decoded_size;
>
> === modified file 'sql/item_func.cc'
> --- sql/item_func.cc 2010-11-23 13:08:10 +0000
> +++ sql/item_func.cc 2011-03-19 13:54:46 +0000
> @@ -940,8 +940,7 @@ longlong Item_func_signed::val_int()
> longlong value;
> int error;
>
> - if (args[0]->cast_to_int_type() != STRING_RESULT ||
> - args[0]->result_as_longlong())
> + if (args[0]->cast_to_int_type() != STRING_RESULT)
> {
> value= args[0]->val_int();
> null_value= args[0]->null_value;
> @@ -982,8 +981,7 @@ longlong Item_func_unsigned::val_int()
> value= 0;
> return value;
> }
> - else if (args[0]->cast_to_int_type() != STRING_RESULT ||
> - args[0]->result_as_longlong())
> + else if (args[0]->cast_to_int_type() != STRING_RESULT)
> {
> value= args[0]->val_int();
> null_value= args[0]->null_value;
> @@ -2220,10 +2218,10 @@ double Item_func_units::val_real()
> void Item_func_min_max::fix_length_and_dec()
> {
> int max_int_part=0;
> decimals=0;
> max_length=0;
> maybe_null=0;
> + thd= current_thd;
We should consider in the future adding thd as an argument to
fix_length_and_dec(). The alternative is that in the functions one
wants to access thd, one implements fix_fields() that just sets thd and
call parent fix_fields().
As almost all Item_date_func's uses thd, it may be a good idea to add
thd to this item and set it in fix_fields(). This would remove a LOT
of calls to current_thd.
<cut>
> +bool Item_func_min_max::get_date(MYSQL_TIME *ltime, uint fuzzy_date)
> {
> longlong UNINIT_VAR(min_max);
> + DBUG_ASSERT(fixed == 1);
> + if (!compare_as_dates)
> + return Item_func::get_date(ltime, fuzzy_date);
Add a comment in which case the above may not be true.
>
> for (uint i=0; i < arg_count ; i++)
> {
> Item **arg= args + i;
> bool is_null;
> + longlong res= get_datetime_value(thd, &arg, 0, compare_as_dates, &is_null);
>
> /* Check if we need to stop (because of error or KILL) and stop the loop */
Using 'compare_as_dates' can give a strange warning:
select greatest(cast("2001-01-5" as date), "2002-02-7aaa");
->
Incorrect date value: '2002-02-7aaa' for column 'cast("2001-01-5" as date)' at row 1
In cases like this when we don't store the value in a field, it would
may be better to just give an error:
Incorrect date value: '2002-02-7aaa' for function 'greatest' at row 1
We could get the name by just sending 'this' to the get_datetime_value
instead of a found argument and we could choose error message based if
the 'warn_item->real_item()->type()' is a FIELD_ITEM or not.
> if (thd->is_error())
> {
> null_value= 1;
> + return 1;
> }
>
> if ((null_value= args[i]->null_value))
> + return 1;
You could combine the above two tests to:
if (thd->is_error() || args[i]->null_value)
{
null_value= 1;
return 1;
}
> if (i == 0 || (res < min_max ? cmp_sign : -cmp_sign) > 0)
> min_max= res;
> }
> + unpack_time(min_max, ltime);
> + if (compare_as_dates->field_type() == MYSQL_TYPE_DATE)
> + ltime->time_type= MYSQL_TIMESTAMP_DATE;
I would rather cache the correct value on upper level and store it in
ltime->time_type directly.
The above test also looks a bit suspicion if you would compare date
with time.
greatest(cast("2001-01-5" as date), cast("10:20:05" as time))
->
2001-01-05
another strange effect is:
greatest(cast("0-0-0" as date), cast("10:20:05" as time))
->
0000-00-00
but:
greatest(cast("0-0-0" as date), cast("10:20:05" as time)) = "0000-00-00"
->
0
(Which is wrong)
compare to:
concat(greatest(cast("0-0-0" as date), cast("10:20:05" as time))) =
"0000-00-00"
->
true
This shows what is really happening:
cast(greatest(cast("0-0-0" as date), cast("10:20:05" as time)) as
datetime(6))
->
0000-00-00 10:20:05.000000
Suggested fix:
a) Do as before, when comparing datetime or date with time, convert
to time, not datetime.
b) Give an error when comparing date or datetime with time.
c) When doing compare of date and time, upgrade type to compare to
DATETIME.
> @@ -2319,19 +2308,14 @@ String *Item_func_min_max::val_str(Strin
> DBUG_ASSERT(fixed == 1);
> if (compare_as_dates)
> {
> + MYSQL_TIME ltime;
> + if (get_date(<ime, TIME_FUZZY_DATE))
> return 0;
> + char buf[MAX_DATE_STRING_REP_LENGTH];
> + int len= my_TIME_to_str(<ime, buf, decimals);
> + str->copy(buf, len, collation.collation);
> + return str;
A better way would be to to do:
str->alloc(MAX_DATE_STRING_REP_LENGTH);
str->set_charset(collation.collation).
str->length(my_TIME_to_str(<ime, str->ptr(), decimals));
This avoids the extra strcpy()
(One should also check the return value for alloc() and copy() here).
You could also make item_timefunc.cc:make_datetime() public and just
call this.
<cut>
> @@ -3979,7 +3968,7 @@ double user_var_entry::val_real(my_bool
> }
> case STRING_RESULT:
> return my_atof(value); // This is null terminated
> - case ROW_RESULT:
> + default:
Add instead ROW_RESULT and TIME_RESULT here.
This is safer as it's less chance we miss something when we add more
result types later.
> DBUG_ASSERT(1); // Impossible
> break;
> }
> @@ -4010,7 +3999,7 @@ longlong user_var_entry::val_int(my_bool
> int error;
> return my_strtoll10(value, (char**) 0, &error);// String is null terminated
> }
> - case ROW_RESULT:
> + default:
Same as above.
> DBUG_ASSERT(1); // Impossible
> break;
> }
> @@ -4042,7 +4031,7 @@ String *user_var_entry::val_str(my_bool
> case STRING_RESULT:
> if (str->copy(value, length, collation.collation))
> str= 0; // EOM error
> - case ROW_RESULT:
> + default:
> DBUG_ASSERT(1); // Impossible
> break;
Same as above.
> }
> @@ -4069,7 +4058,7 @@ my_decimal *user_var_entry::val_decimal(
> case STRING_RESULT:
> str2my_decimal(E_DEC_FATAL_ERROR, value, length, collation.collation, val);
> break;
> - case ROW_RESULT:
> + default:
Same as above.
> DBUG_ASSERT(1); // Impossible
> break;
> }
>
> === modified file 'sql/item_timefunc.cc'
<cut>
> -static bool sec_to_time(longlong seconds, bool unsigned_flag, MYSQL_TIME *ltime)
> +static bool sec_to_time(double seconds, MYSQL_TIME *ltime)
> {
> uint sec;
>
> bzero((char *)ltime, sizeof(*ltime));
> +
> + ltime->time_type= MYSQL_TIMESTAMP_TIME;
>
> if (seconds < 0)
> {
> - if (unsigned_flag)
> - goto overflow;
> ltime->neg= 1;
> if (seconds < -3020399)
> goto overflow;
Shouldn't this be:
if (seconds < -3020399.999999)
Same for the other test.
(Instead of a constant number this should of course be TIME_MAX_VALUE_SECONDS +
TIME_MAX_SECOND_PART/1e6)
Add also tests for this in the test suite!
> @@ -211,9 +95,8 @@ static bool sec_to_time(longlong seconds
> ltime->minute= TIME_MAX_MINUTE;
> ltime->second= TIME_MAX_SECOND;
>
> - char buf[22];
> - int len= (int)(longlong10_to_str(seconds, buf, unsigned_flag ? 10 : -10)
> - - buf);
> + char buf[100];
> + uint len= snprintf(buf, sizeof(buf), "%.80g", ltime->neg ? -seconds: seconds);
I don't think we want to have 80 digit precission here; Looks
strange.; 30 should be more than enough.
<cut>
> @@ -578,6 +464,10 @@ static bool extract_date_time(DATE_TIME_
> l_time->minute > 59 || l_time->second > 59)
> goto err;
>
> + if ((fuzzy_date & TIME_NO_ZERO_DATE) &&
> + (l_time->year == 0 || l_time->month == 0 || l_time->day == 0))
> + goto err;
> +
This is something we didn't check before. If this is a compatibily
issue, you should list it in your matrix.
I tried to get the above to execute, but could not. I tried with:
select str_to_date("1997-00-04 22:23:00","%Y-%m-%D");
but this returned "1997-00-04" independent of the sql mode.
Please add a comment when the above case will be used.
> @@ -1569,8 +1459,9 @@ void Item_func_curdate_local::store_now_
> */
> void Item_func_curdate_utc::store_now_in_TIME(MYSQL_TIME *now_time)
> {
> - my_tz_UTC->gmt_sec_to_TIME(now_time,
> - (my_time_t)(current_thd->query_start()));
> + THD *thd= current_thd;
> + my_tz_UTC->gmt_sec_to_TIME(now_time, thd->query_start());
> + thd->time_zone_used= 1;
> /*
> We are not flagging this query as using time zone, since it uses fixed
> UTC-SYSTEM time-zone.
Why do you flag the time_zone_used when the function comment says that
you don't need to do this?
You also don't set time_zone_used in Item_func_now_utc::store_now_in_TIME()
<cut>
<cut>
> bool Item_func_sysdate_local::get_date(MYSQL_TIME *res,
> uint fuzzy_date __attribute__((unused)))
> {
> + MYSQL_TIME ltime;
> store_now_in_TIME(<ime);
> *res= ltime;
> return 0;
> }
Why not do directly:
store_now_in_TIME(res);
Actaully, I don't undestand why Item_func_sysdate_local() have to
have a :store_now_in_TIME() function at all. As this function can only be
called by a Item_func_sysdate_local object and we only call it from
this place it would be more logical to have the code from
store_now_in_time() here.
It actually doesn't make sense that this function inherits from
Item_func_now() as we get an extra store_now_in_TIME() call from
Item_func_now::fix_length_and_dec() that we don't need or want.
Not critical to do it now, but should be done eventually.
<cut>
> @@ -2050,42 +1845,14 @@ bool Item_func_from_unixtime::get_date(M
> void Item_func_convert_tz::fix_length_and_dec()
> {
> collation.set(&my_charset_bin);
> - decimals= 0;
> - max_length= MAX_DATETIME_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
> + max_length= MAX_DATETIME_WIDTH;
Should probably be:
max_length= MAX_DATETIME_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
This is how all the other functions are defined.
> + decimals= args[0]->decimals;
> + if (decimals && decimals != NOT_FIXED_DEC)
> + max_length+= min(decimals, MAX_SEC_PART_DIGITS) + 1;
> maybe_null= 1;
> }
I was actually wondering why we don't have an
Item_temporal_func::fix_fields() function that was defined to set the
above as this should be the default for all DATETIME results.
This would allow us to do something like:
bool Item_func_convert_tz::fix_fields(thd, items)
{
decimals= args[0]->decimals;
decimals= min(decimals, MAX_SEC_PART_DIGITS);
return Item_temporal_func::fix_fields(thd, items);
}
And of course Item_temporal_func::fix_fields() could then check if
decimals > MAX_SEC_PART_DIGITS and give an error.
(Just an idea)
<cut>
> @@ -2137,8 +1908,7 @@ void Item_date_add_interval::fix_length_
>
> collation.set(&my_charset_bin);
> maybe_null=1;
> - max_length=MAX_DATETIME_FULL_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
> + max_length=MAX_DATETIME_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
Please move setting of the above to where you do:
if (decimals)
max_length+= min(decimals, MAX_SEC_PART_DIGITS) + 1;
So that the result is:
max_length= MAX_DATETIME_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
if (decimals)
max_length+= min(decimals, MAX_SEC_PART_DIGITS) + 1;
This makes it much easier to ensure that max_length is correct
> @@ -2146,7 +1916,11 @@ void Item_date_add_interval::fix_length_
>
> - If first arg is a MYSQL_TYPE_DATETIME result is MYSQL_TYPE_DATETIME
> - If first arg is a MYSQL_TYPE_DATE and the interval type uses hours,
> + minutes or seconds then type is MYSQL_TYPE_DATETIME
> + otherwise it's MYSQL_TYPE_DATE
> + - if first arg is a MYSQL_TYPE_TIME and the interval type isn't using
> + anything larger than days, then the result is MYSQL_TYPE_TIME,
> + otherwise - MYSQL_TYPE_DATETIME.
> - Otherwise the result is MYSQL_TYPE_STRING
> (This is because you can't know if the string contains a DATE, MYSQL_TIME or
> DATETIME argument)
> @@ -2163,6 +1937,20 @@ void Item_date_add_interval::fix_length_
> else
> cached_field_type= MYSQL_TYPE_DATETIME;
> }
> + else if (arg0_field_type == MYSQL_TYPE_TIME)
> + {
> + if (int_type >= INTERVAL_DAY && int_type != INTERVAL_YEAR_MONTH)
> + cached_field_type= arg0_field_type;
> + else
> + cached_field_type= MYSQL_TYPE_DATETIME;
> + }
> + if (int_type == INTERVAL_MICROSECOND || int_type >= INTERVAL_DAY_MICROSECOND)
> + decimals= 6;
> + else
> + decimals= args[0]->decimals;
Shouldn't we above use:
decimals= min(args[0]->decimals, 6);
I suspect that some of the other code may find it confusing if
decimals have a too high value.
I tried:
select 20010101000203.000000004 + interval 1 day;
and this did die in an assert, which seams to indicate the the above
change is needed.
> + if (decimals)
> + max_length+= min(decimals, MAX_SEC_PART_DIGITS) + 1;
> + value.alloc(max_length);
> }
>
>
> @@ -2172,7 +1960,7 @@ bool Item_date_add_interval::get_date(MY
> {
> INTERVAL interval;
>
> - if (args[0]->get_date(ltime, TIME_NO_ZERO_DATE) ||
> + if (args[0]->get_date(ltime, TIME_NO_ZERO_DATE|TIME_FUZZY_DATE) ||
Please add space around | to make the above readable
> void Item_func_add_time::fix_length_and_dec()
> {
> enum_field_types arg0_field_type;
> - decimals=0;
> - max_length=MAX_DATETIME_FULL_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
> + max_length= MAX_DATETIME_WIDTH;
Move setting of max_length down before 'if (decimals)' (see earlier comment).
> + decimals= max(args[0]->decimals, args[1]->decimals);
Add:
decimals= min(decimals, MAX_DATETIME_PRECISION);
<cut>
> + MYSQL_TIME copy= *ltime;
> + Lazy_string_time str(©);
> +
> + check_time_range(ltime, decimals, &was_cut);
> + if (was_cut)
> + make_truncated_value_warning(current_thd, MYSQL_ERROR::WARN_LEVEL_WARN,
> + &str, MYSQL_TIMESTAMP_TIME, NullS);
Move setting of Lazy_string_time str(©) inside 'if (was_cut)...'
> +bool Item_func_timediff::get_date(MYSQL_TIME *ltime, uint fuzzy_date)
> {
> DBUG_ASSERT(fixed == 1);
> longlong seconds;
> long microseconds;
> - int l_sign= 1;
> - MYSQL_TIME l_time1 ,l_time2, l_time3;
> + int l_sign= 1, was_cut= 0;
> + MYSQL_TIME l_time1,l_time2,l_time3;
> + Lazy_string_time str(&l_time3);
Move setting of 'str' to where it's used.
> @@ -2899,6 +2535,9 @@ String *Item_func_timediff::val_str(Stri
> l_time1.time_type != l_time2.time_type)
> goto null_date;
>
> + if (fuzzy_date & TIME_NO_ZERO_IN_DATE)
> + goto null_date;
It took some time to find a test to trigger this code.
Please add a comment:
/*
The following test maybe true when we are called by a function that
require a datetime object. For example when doing:
select timediff('...') + interval 1 month
*/
In any case, this should be moved to start of the function as there is
no way this function can ever return anything else than NULL if this
flag is set and there is no reason to calculate any else in this case.
> @@ -2915,16 +2554,22 @@ String *Item_func_timediff::val_str(Stri
> if (l_time1.neg && (seconds || microseconds))
> l_time3.neg= 1-l_time3.neg; // Swap sign of result
>
Add comment here:
/*
Adjust second so that we don't loose information from (long) cast.
This is safe to to as calc_time_from_sec() will overflow also for
INT_MAX32 and we will get the warning we expect
*/
> + set_if_smaller(seconds, INT_MAX32);
> calc_time_from_sec(&l_time3, (long) seconds, microseconds);
>
> + if ((fuzzy_date & TIME_NO_ZERO_DATE) && (seconds == 0) && (microseconds == 0))
> + goto null_date;
TIME_NO_ZERO_DATE means that neither day or month should not be 0
As this is always the case for time_diff() we could merge this test
with the above 'if (fuzzy_date & TIME_NO_ZERO_IN_DATE)'
For some examples where this now goes strange:
dayofyear(timediff('0000-00-01 00:00:05', '0000-00-01 00:00:05')) -> NULL
dayofyear(timediff('0000-00-01 00:00:05', '0000-00-01 00:00:04')) -> 0
dayofyear(cast('0000-00-01 00:00:05' as datetime)) -> NULL
I think it should return NULL in all cases (which is what happens if
you move the TIME_NO_ZERO_DATE test as suggested above).
<cut>>
>
> @@ -3292,36 +2930,45 @@ get_date_time_result_type(const char *fo
> void Item_func_str_to_date::fix_length_and_dec()
> {
> maybe_null= 1;
> - decimals=0;
> cached_field_type= MYSQL_TYPE_DATETIME;
> - max_length= MAX_DATETIME_FULL_WIDTH*MY_CHARSET_BIN_MB_MAXLEN;
> - cached_timestamp_type= MYSQL_TIMESTAMP_NONE;
> + max_length= MAX_DATETIME_WIDTH+MAX_SEC_PART_DIGITS;
Add '*MY_CHARSET_BIN_MB_MAXLEN' for completeness here and in the
other places we set max_length..
<cut>
> +++ sql/item_timefunc.h 2011-03-23 12:31:06 +0000
> @@ -323,147 +323,100 @@ class Item_func_unix_timestamp :public I
> };
>
>
> -class Item_func_time_to_sec :public Item_int_func
> +class Item_func_time_to_sec :public Item_real_func
> {
> public:
> - Item_func_time_to_sec(Item *item) :Item_int_func(item) {}
> - longlong val_int();
> + Item_func_time_to_sec(Item *item) :Item_real_func(item) {}
> const char *func_name() const { return "time_to_sec"; }
> + double val_real();
> void fix_length_and_dec()
> {
> maybe_null= TRUE;
> - decimals=0;
> - max_length=10*MY_CHARSET_BIN_MB_MAXLEN;
> + decimals=args[0]->decimals;
> + max_length=17;
max_length= 17*MY_CHARSET_BIN_MB_MAXLEN;
<cut>
> +class Item_temporal_func: public Item_func
> {
> public:
> + Item_temporal_func() :Item_func() {}
> + Item_temporal_func(Item *a) :Item_func(a) {}
> + Item_temporal_func(Item *a, Item *b) :Item_func(a,b) {}
> + Item_temporal_func(Item *a, Item *b, Item *c) :Item_func(a,b,c) {}
> enum Item_result result_type () const { return STRING_RESULT; }
> + enum_field_types field_type() const { return MYSQL_TYPE_DATETIME; }
> String *val_str(String *str);
> longlong val_int();
> + double val_real();
> + bool get_date(MYSQL_TIME *res, uint fuzzy_date) { DBUG_ASSERT(0); return 1; }
> my_decimal *val_decimal(my_decimal *decimal_value)
> + { return val_decimal_from_date(decimal_value); }
> + Field *tmp_table_field(TABLE *table)
> + { return tmp_table_field_from_field_type(table, 0); }
> int save_in_field(Field *field, bool no_conversions)
> + { return save_date_in_field(field); }
> };
>
See earlier suggestion to add a fix_fields() function and add a
'THD' variable. The only problem with this is to decide upon the
value of 'decimal' for functions that depends on it's arguments for
the value of decimal. One way to do this would bo first call
item->fix_length_and_dec() and then use the value of decimals and
field_type() to calculate max_length.
Having a Item_temporal_func::fix_fields() would also make it trivial
to ensure we have always a legal value for decimals!
We also would not have to do the test:
max_length+= (decimals ? min(decimals, MAX_SEC_PART_DIGITS)+1 : 0);
In a lot of places.
<cut>
> @@ -563,16 +502,10 @@ class Item_func_now_utc :public Item_fun
> class Item_func_sysdate_local :public Item_func_now
Inheriting from Item_temporal_func() may be better.
(See earlier comment)
<cut>
> +class Item_func_sec_to_time :public Item_timefunc
> {
> public:
> + Item_func_sec_to_time(Item *item) :Item_timefunc(item) {}
> + bool get_date(MYSQL_TIME *res, uint fuzzy_date);
> void fix_length_and_dec()
> {
> collation.set(&my_charset_bin);
> maybe_null=1;
> + decimals= args[0]->decimals;
> + if (decimals != NOT_FIXED_DEC && decimals > MAX_SEC_PART_DIGITS)
> + decimals= MAX_SEC_PART_DIGITS;
I am not sure that NOT_FIXED_DEC works for time; It may be better to
always limit this to MAX_SEC_PART_DIGITS.
<cut>
> +class Item_date_typecast :public Item_temporal_typecast
> {
> public:
> - Item_date_typecast(Item *a) :Item_typecast_maybe_null(a) {}
> + Item_date_typecast(Item *a) :Item_temporal_typecast(a) {}
> const char *func_name() const { return "cast_as_date"; }
> - String *val_str(String *str);
> bool get_date(MYSQL_TIME *ltime, uint fuzzy_date);
> - bool get_time(MYSQL_TIME *ltime);
> const char *cast_type() const { return "date"; }
> enum_field_types field_type() const { return MYSQL_TYPE_DATE; }
> - Field *tmp_table_field(TABLE *table)
> - {
> - return tmp_table_field_from_field_type(table, 0);
> - }
> void fix_length_and_dec()
> {
> collation.set(&my_charset_bin);
> + decimals= 0;
You don't need to set decimals to 0; This is already done in the item
constructor. Of course, if we have Item_temporal::fix_fields() this
function would not be needed at all...
> + max_length= MAX_DATE_WIDTH;
> maybe_null= 1;
<cut>
> +class Item_time_typecast :public Item_temporal_typecast
> {
> public:
> - Item_time_typecast(Item *a) :Item_typecast_maybe_null(a) {}
> + Item_time_typecast(Item *a, uint dec_arg)
> + :Item_temporal_typecast(a) { decimals= dec_arg; }
> const char *func_name() const { return "cast_as_time"; }
> + bool get_date(MYSQL_TIME *ltime, uint fuzzy_date);
> const char *cast_type() const { return "time"; }
> enum_field_types field_type() const { return MYSQL_TYPE_TIME; }
> + void fix_length_and_dec()
> {
> + collation.set(&my_charset_bin);
> + maybe_null= 1;
> + max_length= MIN_TIME_WIDTH;
> + if (decimals == NOT_FIXED_DEC)
> + decimals= args[0]->decimals;
You proabaly should test for AUTO_SEC_PART_DIGITS as this is the
define used in sql_yacc.yy
> + if (decimals && decimals != NOT_FIXED_DEC)
> + max_length+= min(decimals, MAX_SEC_PART_DIGITS) + 1;
You need to limit decimals to be <= MAX_SEC_PART_DIGITS or you will get
an assert in some code (see earlier test).
However, I agree that this is nice:
create table t1 (a double);
insert into t1 values(1.0001),(1.000000000001)
select time(a) from t1;
->
| 00:00:01.000100 |
| 00:00:01 |
If I change the create to:
create table t1 (a double(10,10));
the above will cause an assert in:
my_time.c:1047: my_time_to_str: Assertion `digits >= 0 && digits <= 6' failed.
<cut>
> +class Item_func_last_day :public Item_datefunc
> {
> public:
> - Item_func_last_day(Item *a) :Item_date(a) {}
> + Item_func_last_day(Item *a) :Item_datefunc(a) {}
> const char *func_name() const { return "last_day"; }
> bool get_date(MYSQL_TIME *res, uint fuzzy_date);
> + void fix_length_and_dec()
> + {
> + maybe_null=1;
Better if Item_datefunc::fix_length_and_dec() or
Item_temporal_func::fix_fields() sets maybe_null= 1 by default.
> === modified file 'sql/log_event.cc'
In the file you have:
#ifdef MYSQL_CLIENT
...
#ifdef MYSQL_CLIENT
void free_table_map_log_event(...)
#endif
#endif
Please remove the inner #ifdef MYSQL_CLIENT
> @@ -3598,7 +3614,7 @@ bool Start_log_event_v3::write(IO_CACHE*
> int2store(buff + ST_BINLOG_VER_OFFSET,binlog_version);
> memcpy(buff + ST_SERVER_VER_OFFSET,server_version,ST_SERVER_VER_LEN);
> if (!dont_set_created)
> - created= when= get_time();
> + created= get_time();
Add a comment before get_time():
/* get_time sets 'when' and 'when_sec_part' as a side effect */
<cut>
> @@ -1000,16 +1004,27 @@ class Log_event
> { return 0; }
> virtual bool write_data_body(IO_CACHE* file __attribute__((unused)))
> { return 0; }
> - inline time_t get_time()
> + inline my_time_t get_time()
> {
> THD *tmp_thd;
> if (when)
> return when;
> if (thd)
> + {
> + when= thd->start_time;
> + when_sec_part= thd->start_time_sec_part;
> + return when;
> + }
thd should alway be true. Can you at some point in time check when
this is not the case and if needed set thd for the class !
> if ((tmp_thd= current_thd))
current_thd() should never fail. If it can, there should be a comment
in the code about this!
> + {
> + when= tmp_thd->start_time;
> + when_sec_part= tmp_thd->start_time_sec_part;
> + return when;
> + }
> + my_hrtime_t hrtime= my_hrtime();
> + when= hrtime_to_my_time(hrtime);
> + when_sec_part= hrtime_sec_part(hrtime);
> + return when;
> }
> #endif
> virtual Log_event_type get_type_code() = 0;
>
> === modified file 'sql/mysql_priv.h'
<cut>
> +/*
> + to unify the code that differs only in the argument passed to the
> + error message (string vs. number)
> +
> + We pass this container around, and only convert the number
> + to a string when necessary.
> +*/
Good idea, but please consider using a better name for the string.
This is more a value wrapper than an string.
> +class Lazy_string
> +{
> +public:
> + Lazy_string() {}
> + virtual ~Lazy_string() {}
> + virtual void copy(String *str) const = 0;
> +};
> +
<cut>
> +class Lazy_string_dbl: public Lazy_string
dbl -> double
(No reason to save a couple of characters here)
<cut>
> === modified file 'sql/set_var.cc'
> --- sql/set_var.cc 2010-10-20 18:21:40 +0000
> +++ sql/set_var.cc 2011-03-01 12:24:36 +0000
> @@ -2715,38 +2715,38 @@ int set_var_collation_client::update(THD
>
> bool sys_var_timestamp::check(THD *thd, set_var *var)
> {
> + double val= var->value->val_real();
> + if (val < 0 || val > MY_TIME_T_MAX)
> {
> my_message(ER_UNKNOWN_ERROR,
> "This version of MySQL doesn't support dates later than 2038",
> MYF(0));
> return TRUE;
> }
> + var->save_result.ulonglong_value= hrtime_from_time(var->value->val_real());
> return FALSE;
> }
>
>
> bool sys_var_timestamp::update(THD *thd, set_var *var)
> {
> - thd->set_time((time_t) var->save_result.ulonglong_value);
> + my_hrtime_t hrtime = { var->save_result.ulonglong_value };
> + thd->set_time(hrtime);
> return FALSE;
> }
>
>
> void sys_var_timestamp::set_default(THD *thd, enum_var_type type)
> {
> - thd->user_time=0;
> + thd->user_time.val= 0;
> }
>
>
> uchar *sys_var_timestamp::value_ptr(THD *thd, enum_var_type type,
> LEX_STRING *base)
> {
> - thd->sys_var_tmp.long_value= (long) thd->start_time;
> - return (uchar*) &thd->sys_var_tmp.long_value;
> + thd->sys_var_tmp.double_value= thd->start_time + thd->start_time_sec_part/1e6;
> + return (uchar*) &thd->sys_var_tmp.double_value;
> }
>
>
> @@ -3045,10 +3045,10 @@ void sys_var_microseconds::set_default(T
> uchar *sys_var_microseconds::value_ptr(THD *thd, enum_var_type type,
> LEX_STRING *base)
> {
> - thd->tmp_double_value= (double) ((type == OPT_GLOBAL) ?
> + thd->sys_var_tmp.double_value= (double) ((type == OPT_GLOBAL) ?
> global_system_variables.*offset :
> thd->variables.*offset) / 1000000.0;
Change 100000.0 to same define constant as 1e6
>
> === modified file 'sql/sql_class.cc'
> @@ -2338,7 +2338,7 @@ bool select_max_min_finder_subselect::se
> case DECIMAL_RESULT:
> op= &select_max_min_finder_subselect::cmp_decimal;
> break;
> - case ROW_RESULT:
> + default:
Add back ROW_RESULT and add TIME_RESULT
> +++ sql/sql_class.h 2011-03-26 10:59:34 +0000
<cut>
> inline void set_time()
> {
> + if (user_time.val)
> {
> + start_time= hrtime_to_my_time(user_time);
> + start_time_sec_part= hrtime_sec_part(user_time);
> start_utime= utime_after_lock= my_micro_time();
> }
> else
> + {
> + my_hrtime_t hrtime;
> + my_timediff_t timediff;
> + my_micro_and_hrtime(&timediff, &hrtime);
> + start_time= hrtime_to_my_time(hrtime);
> + start_time_sec_part= hrtime_sec_part(hrtime);
> + utime_after_lock= start_utime= timediff.val;
> + }
For the future:
As we are now always calling hrtime(), we should consider using
hrtime also for start_utime. At least on systems with a fast hrtime()
better to use this for both my_micro_time() and hrtime().
> +++ sql/sql_prepare.cc 2011-03-01 12:24:36 +0000
> @@ -487,8 +487,7 @@ static void set_param_time(Item_param *p
> }
> else
> set_zero_time(&tm, MYSQL_TIMESTAMP_TIME);
> - param->set_time(&tm, MYSQL_TIMESTAMP_TIME,
> - MAX_TIME_WIDTH * MY_CHARSET_BIN_MB_MAXLEN);
> + param->set_time(&tm, MYSQL_TIMESTAMP_TIME, MAX_TIME_FULL_WIDTH);
Should be:
MAX_TIME_FULL_WIDTH * MY_CHARSET_BIN_MB_MAXLEN
> === modified file 'sql/sql_select.cc'
> @@ -9455,7 +9445,7 @@ test_if_equality_guarantees_uniqueness(I
> {
> return r->const_item() &&
> /* elements must be compared as dates */
> - (Arg_comparator::can_compare_as_dates(l, r, 0) ||
> + (l->cmp_type() == TIME_RESULT ||
> /* or of the same result type */
> (r->result_type() == l->result_type() &&
> /* and must have the same collation if compared as strings */
Add a comment:
/*
It's enough to test 'l' for TIME_RESULT as only 'l' can be a field
in this context.
*/
> === modified file 'sql/sql_show.cc'
> --- sql/sql_show.cc 2010-11-02 13:20:02 +0000
<cut>
> @@ -6288,6 +6305,8 @@ ST_FIELD_INFO columns_fields_info[]=
> 0, (MY_I_S_MAYBE_NULL | MY_I_S_UNSIGNED), 0, OPEN_FRM_ONLY},
> {"NUMERIC_SCALE", MY_INT64_NUM_DECIMAL_DIGITS , MYSQL_TYPE_LONGLONG,
> 0, (MY_I_S_MAYBE_NULL | MY_I_S_UNSIGNED), 0, OPEN_FRM_ONLY},
> + {"DATETIME_PRECISION", MY_INT64_NUM_DECIMAL_DIGITS, MYSQL_TYPE_LONGLONG,
> + 0, (MY_I_S_MAYBE_NULL | MY_I_S_UNSIGNED), 0, OPEN_FRM_ONLY},
You have added a new field; Please add to compability documentation!
> {"CHARACTER_SET_NAME", MY_CS_NAME_SIZE, MYSQL_TYPE_STRING, 0, 1, 0,
> OPEN_FRM_ONLY},
> {"COLLATION_NAME", MY_CS_NAME_SIZE, MYSQL_TYPE_STRING, 0, 1, "Collation",
>
> +++ sql/sql_table.cc 2011-03-01 12:24:36 +0000
> @@ -34,26 +34,16 @@ const char *primary_key_name="PRIMARY";
>
> static bool check_if_keyname_exists(const char *name,KEY *start, KEY *end);
> static char *make_unique_key_name(const char *field_name,KEY *start,KEY *end);
> -static int copy_data_between_tables(TABLE *from,TABLE *to,
> - List<Create_field> &create, bool ignore,
> - uint order_num, ORDER *order,
> - ha_rows *copied,ha_rows *deleted,
> - enum enum_enable_or_disable keys_onoff,
> - bool error_if_not_empty);
> +static int copy_data_between_tables(TABLE *,TABLE *, List<Create_field> &, bool,
> + uint, ORDER *, ha_rows *,ha_rows *,
> + enum enum_enable_or_disable, bool);
I don't understand why you did this change here and in other places.
I usually prefer to have the field names also in the prototype as it
makes the prototype easier to read!
<cut>
> void make_time(const DATE_TIME_FORMAT *format __attribute__((unused)),
> const MYSQL_TIME *l_time, String *str)
> {
> + uint length= (uint) my_time_to_str(l_time, (char*) str->ptr(), 0);
> str->length(length);
> str->set_charset(&my_charset_bin);
It would be best to do str->alloc(MAX_DATETIME_FULL_WIDTH) at start of
function.
> }
> @@ -713,15 +693,15 @@ void make_date(const DATE_TIME_FORMAT *f
> void make_datetime(const DATE_TIME_FORMAT *format __attribute__((unused)),
> const MYSQL_TIME *l_time, String *str)
> {
> + uint length= (uint) my_datetime_to_str(l_time, (char*) str->ptr(), 0);
It would be best to do str->alloc(MAX_DATETIME_FULL_WIDTH) at start of
function.
> str->length(length);
> str->set_charset(&my_charset_bin);
> }
<cut>
> bool date_add_interval(MYSQL_TIME *ltime, interval_type int_type, INTERVAL interval)
> {
> long period, sign;
>
> + sign= (interval.neg == ltime->neg ? 1 : -1);
Add a comment that shows why the above is needed
>
> switch (int_type) {
> case INTERVAL_SECOND:
> @@ -789,35 +770,29 @@ bool date_add_interval(MYSQL_TIME *ltime
> case INTERVAL_DAY_SECOND:
> case INTERVAL_DAY_MINUTE:
> case INTERVAL_DAY_HOUR:
> + case INTERVAL_DAY:
> {
> + longlong usec, daynr;
> + my_bool neg= ltime->neg;
> + enum enum_mysql_timestamp_type time_type= ltime->time_type;
> +
> + if (ltime->time_type != MYSQL_TIMESTAMP_TIME)
Change to:
if (time_type != MYSQL_TIMESTAMP_TIME)
> + ltime->day+= calc_daynr(ltime->year, ltime->month, 1) - 1;
> +
> + usec= COMBINE(ltime) + sign*COMBINE(&interval);
> +
> + unpack_time(usec, ltime);
> + ltime->time_type= time_type;
> + ltime->neg^= neg;
> +
> + if (time_type == MYSQL_TIMESTAMP_TIME)
> + break;
> +
> + if (int_type != INTERVAL_DAY)
> + ltime->time_type= MYSQL_TIMESTAMP_DATETIME; // Return full date
> +
> + daynr= ltime->day + 32*(ltime->month + 13*ltime->year);
Add a comment for the above:
/*
unpack_time() calculates month and day based on the assumption that
the date was packed with 'pack_time()'. The following code restores
the number of days.
*/
another option would be to do:
daynr= usec / 1000000/24/60/60;
This would at least be easier to understand...
<cut>
Here is some bugs I found:
CREATE TABLE t3 (a time(6));
insert into t3 values(-100);
select * from t3;
->8299:28:36.710656
If you use just (a time) you get: "-00:01:00" which is correct.
cast("2101-00-01 02:03:04" as datetime) -> "2101-00-01 02:03:04"
but
cast(cast("2101-00-01 02:03:04" as datetime) as time) -> NULL
Other things:
Item_func_unix_timestamp() should return microseconds.
Otherwise one can't acccess the full precision of Field_timestamp.
One should also be able to do:
select unix_timestamp(now(6));
For fast access to the current clock in maximum precision.
---------
general_log.timestamp should be of type timestamp(6)
slow_log.query_time and slow_log.lock_time() should be of type TIME(6)
-----------
Regards,
Monty
2
3
Re: [Maria-developers] [Commits] Rev 2992: Removed some alias warnings in lp:maria/5.3
by Kristian Nielsen 19 May '11
by Kristian Nielsen 19 May '11
19 May '11
Michael Widenius <monty(a)askmonty.org> writes:
> ------------------------------------------------------------
> revno: 2992
> revision-id: monty(a)askmonty.org-20110517214756-6f7fv5uk68i4hb3b
> parent: sanja(a)askmonty.org-20110516132109-h2g9qzrfq3g9t6em
> committer: Michael Widenius <monty(a)askmonty.org>
> branch nick: maria-5.3
> timestamp: Wed 2011-05-18 00:47:56 +0300
> message:
> Removed some alias warnings
> Fixed alias bug when compiling with gcc 4.2.4 that caused subselect.test to fail
"Fix strict aliasing bug that caused subselect.test to fail"
("alias bug" can mean different things. And the problem is not GCC specific,
see below)
> === modified file 'sql/item_subselect.cc'
> --- a/sql/item_subselect.cc 2011-05-16 12:07:04 +0000
> +++ b/sql/item_subselect.cc 2011-05-17 21:47:56 +0000
> @@ -3563,7 +3562,15 @@ subselect_single_select_engine::change_r
> }
> else
> result= res;
> - return select_lex->join->change_result(result);
> +
> + /*
> + We can't use 'result' below as gcc 4.2.4's alias optimization
> + assumes that result was not changed by thd->change_item_tree().
> + I tried to find a solution to make gcc happy, but could not find anything
> + that would not require a lot of extra code that would be harder to manage
> + than the current code.
> + */
> + return select_lex->join->change_result(res);
> }
I do not agree with this patch.
Instead do something like this:
@@ -3559,7 +3559,9 @@ subselect_single_select_engine::change_r
nothing special about Item* pointer so it is safe conversion. We do
not change the interface to be compatible with MySQL.
*/
- thd->change_item_tree((Item**) &result, (Item*)res);
+ Item *temp_item;
+ thd->change_item_tree(&temp_item, (Item*)res);
+ result= (select_result_interceptor *)temp_item;
}
else
result= res;
This makes the code correct (at least with respect to standard C++ strict
aliasing), not just work-around the problem for one specific combination of
compiler flags for one specific compiler version.
The problem has nothing to do with any particular GCC version (except that
this is where we happened to detect it), the problem is passing a
select_result_interceptor** when an Item** is needed. This is in violation of
the strict aliasing rule in standard C++.
----
However, I think we urgently need to start using -fno-strict-aliasing (and
probably -fno-strict-overflow as well). Davi told me MySQL@Oracle is doing
this as well.
In effect there are two dialects of C/C++, even though they are rarely
identified as such.
- "System C/C++", which is a high-level assembler, where everything is bytes
and pointers are just 8 (or 4) byte integers.
- "Application C/C++", which is a type-safe language, and pointers are
abstract objects that cannot be arbitrarily manipulated.
Clearly, many (most?) developers are programming in "System C/C++", including
you. That is fine, Postgress, the Linux Kernel, they use the same dialect.
However, GCC by default uses "Application C/C++" (-fstrict-aliasing). We need
to tell GCC which dialict of C/C++ we are using (-fno-strict-aliasing for
"System C/C++").
If we could just agree on which dialect we use, and let GCC know our choice,
things will be much better.
- Kristian.
2
1
Re: [Maria-developers] [Commits] Rev 2994: Original idea from Zardosht Kasheff to add HA_CLUSTERED_INDEX in lp:maria/5.3
by Sergei Golubchik 19 May '11
by Sergei Golubchik 19 May '11
19 May '11
Hi, Michael!
On May 18, Michael Widenius wrote:
> === modified file 'sql/handler.h'
> --- a/sql/handler.h 2011-05-16 11:05:45 +0000
> +++ b/sql/handler.h 2011-05-18 16:26:30 +0000
> @@ -161,8 +161,11 @@
> */
> #define HA_KEY_SCAN_NOT_ROR 128
> #define HA_DO_INDEX_COND_PUSHDOWN 256 /* Supports Index Condition Pushdown */
> -
> -
> +/*
> + Data is clustered on this key. This means that when you read the key
> + you also get the row data in the same block.
perhaps, better to say that "you get the row data without an extra seek
(or even without an extra disk read)".
A question: what about memory-based (and SSD) engines? Could/Should one
consider all indexes clustered?
> +*/
> +#define HA_CLUSTERED_INDEX 512
>
> /*
> bits in alter_table_flags:
> @@ -2311,9 +2314,22 @@ class handler :public Sql_alloc
>
>
> /*
> - @retval TRUE Primary key (if there is one) is clustered
> - key covering all fields
> - @retval FALSE otherwise
> + Check if the primary key (if there is one) is a clustered key covering
> + all fields. This means:
I wouldn't say that it "checks if the primary key is clustered", but
something like "historical method that returns TRUE if everything below
holds:"
> +
> + - Data is stored together with the primary key (no secondary lookup
> + needed to find the row data). The optimizer uses this to find out
> + the cost of fetching data.
> + - The primary key is part of each secondary key and is used
> + to find the row data in the primary index when reading trough
> + secondary indexes.
> + - When doing a HA_KEYREAD_ONLY we get also all the primary key parts
> + into the row. This is critical property used by index_merge.
and here "all the above is usually true for engines that store the row
data in the primary key index (e.g. in a b-tree), and use the primary
key value as a position()"
> +
> + For a clustered primary key, index_flags() returns also HA_CLUSTERED_INDEX
> +
> + @retval TRUE yes
> + @retval FALSE No.
> */
> virtual bool primary_key_is_clustered() { return FALSE; }
> virtual int cmp_ref(const uchar *ref1, const uchar *ref2)
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc 2011-05-16 12:07:04 +0000
> +++ b/sql/sql_select.cc 2011-05-18 16:26:30 +0000
> @@ -16525,9 +16527,9 @@ test_if_skip_sort_order(JOIN_TAB *tab,OR
> */
> DBUG_ASSERT (ref_key != (int) nr);
>
> - bool is_covering= table->covering_keys.is_set(nr) ||
> - (nr == table->s->primary_key &&
> - table->file->primary_key_is_clustered());
> + bool is_covering= (table->covering_keys.is_set(nr) ||
> + (table->file->index_flags(nr, 0, 1) &
> + HA_CLUSTERED_INDEX));
See, that's what I said. It make sense to set covering_keys bitmap
for all clustered indexes once, when the table is opened, and then you
not only get this condition simpler, but probably will also cover other
cases where table->covering_keys is checked (or may be checked in the
future).
> /*
> Don't use an index scan with ORDER BY without limit.
> === modified file 'sql/sql_table.cc'
> --- a/sql/sql_table.cc 2011-05-16 11:05:45 +0000
> +++ b/sql/sql_table.cc 2011-05-18 16:26:30 +0000
> @@ -7983,7 +7983,8 @@ copy_data_between_tables(TABLE *from,TAB
>
> if (order)
> {
> - if (to->s->primary_key != MAX_KEY && to->file->primary_key_is_clustered())
> + if (to->s->primary_key != MAX_KEY &&
> + to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX)
Hmm, good.
It took me a while to come up with a realistic use case where
to->file->primary_key_is_clustered() is false, but
to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX is true.
Example: new InnoDB optimization that makes it to use internal generated
primary key when user creates a PK of, say, VARCHAR(255). That is, in
this optimization InnoDB treat this key as UNIQUE but not PRIMARY.
Not completely unrealistic idea. In this case
to->s->primary_key != MAX_KEY is true, there is a primary key.
to->file->primary_key_is_clustered() is false, but
to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX is true.
Heh, this is a good example, perhaps we need to apply it as a test to
other code that uses primary_key_is_clustered() and see if the condition
is used correctly.
> {
> char warn_buff[MYSQL_ERRMSG_SIZE];
> my_snprintf(warn_buff, sizeof(warn_buff),
>
Regards,
Sergei
2
2