Hi, Monty! On Apr 18, Michael Widenius wrote:
diff --git a/mysql-test/main/rownum.test b/mysql-test/main/rownum.test new file mode 100644 index 00000000000..8d89eed2f86 --- /dev/null +++ b/mysql-test/main/rownum.test @@ -0,0 +1,542 @@ +--source include/have_sequence.inc +# +# Test of basic rownum() functionallity +# Test are done with Aria to ensure that row order is stable
what does that mean?
It means that SELECT * will return the rows in the same order for the test. (There is normally no guaranteed that insert and select order are the same).
But both MyISAM and InnoDB guarantee that too. What's so special with Aria?
+ OPEN c_cursor; + FETCH c_cursor INTO v_a, v_b, v_rn; + WHILE c_cursor%FOUND LOOP + SELECT concat(v_a,'--',v_b,'--',v_rn); + FETCH c_cursor INTO v_a, v_b, v_rn; + END LOOP; + CLOSE c_cursor; +END;| +DELIMITER ;| + +--error ER_FUNCTION_CANNOT_BE_USED_IN_CLAUSE +select a, rownum from t1 group by a, rownum having rownum < 3; +select a, rownum from t1 group by a, rownum having "rownum" < 3;
what's the difference between these two? Why the first is an error and the second is not?
There are quotes around rownum in the second version. In Oracle mode that is an error, in MariaDB mode it is not.
I can see the quotes, that's what I asked about. Why *in Oracle mode* the first is an error and the second is not? I'd expect them to be identical.
+select a, rownum as r from t1 group by a, rownum having r < 3;
And the this one. Why is it not an error either?
Because 'r' is a reference, not a function. 'r' works on the result set. rownum cannot be executed in the having clause (which is executed after all rows are returned).
I don't understand it. HAVING is not executed *after* all rows are returned, it's executed *while* they're returned. What is the difference between rownum directly in HAVING and referring to it by an alias?
index 53c236bbce7..c237fcd9ecc 100644
--- a/sql/item_func.h +++ b/sql/item_func.h @@ -2125,6 +2125,63 @@ class Item_func_rand :public Item_real_func };
+class Item_func_rownum :public Item_longlong_func +{ + /* + This points to a variable that contains the number of rows + accpted so far in the result set + */ + ha_rows *accepted_rows; + SELECT_LEX *select; +public: + Item_func_rownum(THD *thd); + longlong val_int() override; + LEX_CSTRING func_name_cstring() const override + { + static LEX_CSTRING name= {STRING_WITH_LEN("rownum") }; + return name; + } + enum Functype functype() const override { return ROWNUM_FUNC; } + void update_used_tables() override {} + bool const_item() const override { return 0; } + void fix_after_optimize(THD *thd) override; + bool fix_fields(THD* thd, Item **ref) override; + bool fix_length_and_dec() override + { + unsigned_flag= 1; + used_tables_cache= RAND_TABLE_BIT; + const_item_cache=0; + set_maybe_null();
how can rownum be NULL?
In context where there are no rows. like SET @A=rownum;
Do you have test cases for that?
+ return FALSE; + } + void cleanup() override + { + Item_longlong_func::cleanup(); + /* Ensure we don't point to freed memory */ + accepted_rows= 0; + } + bool check_vcol_func_processor(void *arg) override + { + return mark_unsupported_function(func_name(), "()", arg, + VCOL_IMPOSSIBLE); + } + bool check_handler_func_processor(void *arg) override + { + return mark_unsupported_function(func_name(), "()", arg, + VCOL_IMPOSSIBLE);
first, as I wrote above, I think that there's no logical reason why rownum should not work with handler.
Please read the patch. It cannot be in the handler for multiple reasons: - INSERT, UPDATE, LOAD DATA - handler does not know if the row is accepted - joins etc.
I mean the HANDLER statement, that's why my comment is just below check_handler_func_processor() method.
but even if it doesn't, it's much better to check for SQLCOM_HA_READ from Item_func_rownum::fix_fields instead of traversing the whole item tree for every single HANDLER READ command looking for Item_func_rownum that nobody (literally, not a single user out of all our millions) will use there.
There is only a traverse of the tree looking for rownum if the rownum function is used.
Why do you think there is some extra travelling of the tree for HANDLER_READ ?
Because, see above, this comment refers to check_handler_func_processor() method this is called on every item by walrking the item tree. As there is only one item weird enough to possibly need it, I'd rather do the check in the item directly.
+ } + Item *get_copy(THD *thd) override + { + return this;
It looks wrong. We don't have any other Item that returns `this` here. I think rownum cannot be pushed down because of RAND_TABLE_BIT, So better return NULL here, meaning "cannot be pushed down". Instead of `this` meaning "if you push it down, it'll surreptitiously start corrupting the memory"
I don't think it can trash memory even if it would be pushed down. As long as the item exists, it should be safe to use anywhere. I also don't trust the code to be able to cope with returning null in get_copy.
Quite the contrary. There are other items that return NULL from get_copy(). It's expected and normal, the code is designed to support it, it was part of the original implementation. No get_copy() returns this, the code is not designed to handle it.
+ } + /* This function is used in insert, update and delete */ + void store_pointer_to_row_counter(ha_rows *row_counter)
So this isn't a virtual method, and to use it you downcast after checking type() == FUNC_ITEM and functype() == ROWNUM_FUNC. To avoid adding a new virtual method thus blowing up the vtable, I presume.
Yes, it is not a virtual and should not be. And yes, we do not want to add any new virtual functions to items when we have close to 600+ items in the system...
It is also much faster to do an if on upper level than doing a function call.
And fix_after_optimize() is virtual. To avoid error-prone downcasting, would be my guess.
Yes. This is something we may need for other items in the future, which is why I made it virtaul.
Do you have in mind any other items that will need it any time soon? If not, then I agree with what you wrote above, we do not want to add any new virtual functions to items when we have close to 600+ items in the system.
--- a/sql/item_func.cc +++ b/sql/item_func.cc @@ -7206,3 +7206,72 @@ void Item_func_setval::print(String *str, enum_query_type query_type) str->append_ulonglong(round); str->append(')'); } + + +/* + Return how many row combinations has accepted so far + 1 + + The + 1 is to ensure that 'WHERE ROWNUM <=1' returns one row
this is a very weird explanation, it sounds as if it's a heuristical a hack for ROWNUM <=1 to work
Why not to write "+1 takes into account the currently computed row"
Because the row is not yet computed or accepted. We can call rownum any time, after a row is accpted, before etc.
I personally prefer the current comment as it is more well defined.
So, you mean, +1 is not needed for WHERE ROWNUM <=2 to return two rows? Or for WHERE ROWNUM<=100 to return 100 rows? You pick one very specific case as an explanation, this implicitly assumes it's the main reason. Which it is not, even if we'll only consider exactly the same condition template, there are still 18446744073709551614 other reasons :) +1 is, exactly, to take into account the current row, the one being processed right now. By the way, instead of +1 here, you could simply initialize your accepted_rows with 1, not with 0.
+*/ + +longlong Item_func_rownum::val_int() +{ + if (!accepted_rows) + { + /* + Rownum is not properly set up. Probably used in wrong context when + it should not be used. In this case returning 0 is probably the best + solution. + */
DBUG_ASSERT(0); here?
No. One can use rownum in a lot of context where it does makes sence. Instead of trying to generate an error for every of these cases, I decided to return 0 for these. In the future maybe some of these will be valid and then things can change. I do not want to see asserts in test cases people add for things like this.
Do you have test cases for that?
+ select->with_rownum= 1; + thd->lex->with_rownum= 1; + thd->lex->uncacheable(UNCACHEABLE_RAND);
Why? ROWNUM doesn't make a query uncacheable, it's perfectly deterministic and repeatable.
Not necessartly. Row order may be different from the engine on subsequent reads for example.
This is the same with LIMIT. Even if the row order might be different, the first result is still correct, so it can be cached and reused. LIMIT doesn't make a query uncacheable.
+ + /* If this command changes data, mark it as unsafe for statement logging */ + if (sql_command_flags[thd->lex->sql_command] & + (CF_UPDATES_DATA | CF_DELETES_DATA)) + thd->lex->set_stmt_unsafe(LEX::BINLOG_STMT_UNSAFE_SYSTEM_FUNCTION);
Why? It's the same as LIMIT. unsafe without ORDER BY and perfectly safe with.
It is much worse. rownum is not safe to use anytime, not even with ORDER BY etc. (Except maybe with INSERT).. So better to give an error if it is used in any context.
example?
Anyway, a better comment is of course good:
/* If rownum() is used we have to preserve the insert row order to make GROUP BY and ORDER BY with filesort work.
SELECT * from (SELECT a,b from t1 ORDER BY a)) WHERE rownum <= 0;
When rownum is not used the optimizer will skip the ORDER BY clause. With rownum we have to keep the ORDER BY as this is what is expected. We also have to create any sort result temporary table in such a way that the inserted row order is maintained. */
Very good, thanks!
if (derived->is_recursive_with_table() && !derived->is_with_table_recursive_reference() && diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc index 806049b4465..5c4dff33914 100644 --- a/sql/sql_insert.cc +++ b/sql/sql_insert.cc @@ -746,6 +746,9 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list, DBUG_RETURN(TRUE); } + + if (thd->lex->with_rownum && table_list->lock_type == TL_WRITE_DELAYED) + table_list->lock_type= TL_WRITE;
why? do you have a test case when it's needed?
Because we cannot calculate row number inside insert delayed... (We do not know if an row is accepted or not).
"Not accepted"? The only possibility for a row to be not accepted is INSERT IGNORE and a unique key violation, right? WHERE clause is done before the delayed thread. I don't see any tests for INSERT IGNORE in your patch, so I cannot say what the intended behavior should be. Could you please add a test for it? For example, create table t1 (a int not null primary key, b int); insert ignore t1 values (1,rownum()),(2,rownum()),(2,rownum()),(3,rownum());
@@ -2133,8 +2141,14 @@ int write_record(THD *thd, TABLE *table, COPY_INFO *info, select_result *sink) autoinc values (generated inside the handler::ha_write()) and values updated in ON DUPLICATE KEY UPDATE. */ - if (sink && sink->send_data(thd->lex->returning()->item_list) < 0) - trg_error= 1; + if (sink) + { + /* We have to adjust 'copied' to make ROWNUM() work in RETURNING */ + info->copied--; + if (sink->send_data(thd->lex->returning()->item_list) < 0) + trg_error= 1; + info->copied++;
this is quite silly. Can you simply increment info->copied after this sink->send_data call?
No, as this leads to wrong resutls. The problem is that for 'returning' the row is already accepted while in all other code it is called before the row is accepted.
Sorry, I don't understand. Why it would lead to wrong results?
+ }
after_trg_or_ignored_err: if (key) diff --git a/sql/sql_limit.h b/sql/sql_limit.h index a4fcedac14a..62fd0546af0 100644 --- a/sql/sql_limit.h +++ b/sql/sql_limit.h @@ -67,6 +67,15 @@ class Select_limit_counters { return select_limit_cnt; } ha_rows get_offset_limit() { return offset_limit_cnt; } + + void set_limit_if_lower(ha_rows limit) + { + if (!offset_limit_cnt)
I suppose, alternatively, you could've had
if (offset_limit_cnt + select_limit_cnt > limit) select_limit_cnt= limit - offset_limit_cnt;
+ { + if (select_limit_cnt > limit) + select_limit_cnt= limit; + } + } };
In some cases LIMIT is absolute. With rownum it only applies if limit is higher. I think it is good to separate those tests.
What do you mean "LIMIT is absolute"? I only suggested that LIMIT can take ROWNUM into account also when OFFSET is non-zero. In your case you only adjust LIMIT if there is no OFFSET. It's not a particularly common case, so not a very important optimization, of course. On the other hand, the suggested change is quite simple.
#endif // INCLUDES_MARIADB_SQL_LIMIT_H diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy index 2bd2575b882..57ba9df42c0 100644 --- a/sql/sql_yacc.yy +++ b/sql/sql_yacc.yy @@ -10286,6 +10287,22 @@ function_call_nonkeyword: if (unlikely($$ == NULL)) MYSQL_YYABORT; } +/* Start SQL_MODE_ORACLE_SPECIFIC + | ROWNUM_SYM optional_braces + { + $$= new (thd->mem_root) Item_func_rownum(thd); + if (unlikely($$ == NULL)) + MYSQL_YYABORT; + } +End SQL_MODE_ORACLE_SPECIFIC */ +/* Start SQL_MODE_DEFAULT_SPECIFIC */ + | ROWNUM_SYM '(' ')' + { + $$= new (thd->mem_root) Item_func_rownum(thd); + if (unlikely($$ == NULL)) + MYSQL_YYABORT; + } +/* End SQL_MODE_DEFAULT_SPECIFIC */
You have two nearly identical rules compiled conditionally. I can pretty much guarantee that they'll go out of sync eventually.
Better do something like
| ROWNUM_SYM /* Start SQL_MODE_ORACLE_SPECIFIC optional_braces End SQL_MODE_ORACLE_SPECIFIC */ /* Start SQL_MODE_DEFAULT_SPECIFIC */ '(' ')' /* End SQL_MODE_DEFAULT_SPECIFIC */ { $$= new (thd->mem_root) Item_func_rownum(thd); if (unlikely($$ == NULL)) MYSQL_YYABORT; }
A little better, but not much. Still not readable.
This whole work of removing sql_yacc_ora.yy and using comments for parts of the grammar that differ - this was done *exactly* to avoid two nearly identical grammars that always needs to be modified in sync. Which is very error-prone. By literally copying a part of the grammar you're reintroducing the old problem of two copies. Not completely, of course, by bit by bit we'll have two copies to maintain again. Please, either use the "little better" grammar that I suggested or rewrite it any way you want that doesn't involve two identical copies of the same rule.
<cut>
diff --git a/sql/sql_select.cc b/sql/sql_select.cc @@ -14210,7 +14236,24 @@ static ORDER * remove_const(JOIN *join,ORDER *first_order, COND *cond, bool change_list, bool *simple_order) { - *simple_order= join->rollup.state == ROLLUP::STATE_NONE; + /* + We can't do ORDER BY using filesort if the select list contains a non + deterministic value like RAND() or ROWNUM(). + For example: + SELECT RAND() as 'r' from t1 order by r;
This one uses temporary for me. Can you put here a correct example please?
It is a correct example and probably the best possible example. I do not understand your comment (does not make any sense). Could you please provide a correct comment?
I tried this in the 10.5 branch: MariaDB [test]> explain select rand() as r from mysql.user order by r\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: user type: index possible_keys: NULL key: PRIMARY key_len: 228 ref: NULL rows: 3 Extra: Using index; Using temporary; Using filesort 1 row in set (0.002 sec) as you see, it uses temporary, so there is no need to do your + *simple_order= !join->rand_table_in_field_list; because there is no bug here. At least for the example that you've used. This is why I asked for an example of a query that indeed behaves incorrectly without this fix of yours.
+static bool is_const(Item *item) +{ + /* + Constant expression in WHERE can be also subquery or other expensive + thing we shoud not evaluate on OPTIMIZATION phase + */ + return item->const_item() && !item->is_expensive(); +} + + +// Check if this is an unexpensive functional expression over basic constants + +static bool is_basic_const(Item *item) +{ + if (item->basic_const_item()) + return true; + if (!item->const_item() || item->is_expensive()) + return false; + if (item->type() == Item::FUNC_ITEM) + { + Item_func *func= ((Item_func *)item); + if (func->argument_count() == 0) + return false; + bool result= true; + Item **args= func->arguments(); + for (uint i= 0; i < func->argument_count() && result; i++) + result= result && is_basic_const(args[i]); + return result; + } + return false; +}
is_basic_const() is a misleading name, a function of basic consts isn't a basic const itself.
This is how bar has named the item and classes. Do a grep after basic_const..
I know, that was not my question. See below
what's an example of an expression has is_const() == true, but is_basic_const() == false?
"1+1" This is the sum of two basic constants.
Yes, this is an example of the item, where item->const_item() == true and item->basic_const() is false. This is not what I asked. I asked for an example, where is_const(item) is true and is_basic_const(item) is false. The question was not about old items, but about your new code. Because your is_basic_const() is not just item->basic_const(), so I asked for an example, to understand a difference.
+{ + return (constant->basic_const_item() && + rownum->type() == Item::FUNC_ITEM && + ((Item_func*) rownum)->functype() == Item_func::ROWNUM_FUNC); +} + +
+ return false; +} + + +static bool permanent_limit(THD *thd, SELECT_LEX_UNIT *unit, ha_rows lim) +{ + SELECT_LEX *gpar= unit->global_parameters(); + if (gpar->select_limit != 0 && + // limit can not be an expression but can be parameter + (!gpar->select_limit->basic_const_item() || + ((ha_rows)gpar->select_limit->val_int()) < lim)) + return false; + + Query_arena *arena, backup; + arena= thd->activate_stmt_arena_if_needed(&backup); + + gpar->select_limit= + new (thd->mem_root)Item_int(thd, lim, 20 /* max 64 bit length*/ );
we have defines for that, MAX_BIGINT_WIDTH, MY_INT64_NUM_DECIMAL_DIGITS, LONGLONG_BUFFER_SIZE - whatever you prefer.
This come from Sanja, but I can change it. (I also updated the comments to be "more english")
Thanks
+/** + Check posibility of LIMIT setting by rownum() of upper SELECT and do it + + @note Ideal is to convert something like + SELECT ... + FROM (SELECT ...) table + WHERE rownum() < <CONSTANT>; + to + SELECT ... + FROM (SELECT ... LIMIT <CONSTANT>) table + WHERE rownum() < <CONSTANT>;
1. what if the outer select uses LIMIT, not ROWNUM? why not to push that down too?
We already have code for that.
But then you don't need to push ROWNUM into a subquery. You have code to convert ROWNUM into a LIMIT, and you just said that we already have code to push LIMIT. So, you do not need to push ROWNUM specifically, it'll happen automatically, won't it?
2. why are you not removing rownum from the outer select?
Because it may not be safe to do in some cases. Better safe than sorry..
if you remove rownum and reset with_rownum=false, it'll generally give optimizer more freedom.
Has no effect on our optimizer in this particular case.
I'm not sure about it. ROWNUM cannot be pushed down into a subquery. LIMIT can be, as you said, so if you convert a ROWNUM into a LIMIT and remove ROWNUM from WHERE, then the whole WHERE can be pushed into a subquery, and a LIMIT can be too. So SELECT * FROM (SELECT col1, col2 FROM tables WHERE cond ORDER BY something) WHERE ROWNUM <=100 AND col1 = 10; can first be changed to SELECT * FROM (SELECT col1, col2 FROM tables WHERE cond ORDER BY something) WHERE col1 = 10 LIMIT 100; and then to SELECT * FROM (SELECT col1, col2 FROM tables WHERE cond AND col1 = 10 ORDER BY something LIMIT 100); and for that you don't even need any special code for pushing ROWNUM into a subquery as a limit, you only need to convert ROWNUM to a LIMIT on the same level of select.
+static bool check_rownum_usage(Item_func *func_item, longlong *limit, + bool *inv_order, const_probe probe) +{ + Item *arg1, *arg2; + *inv_order= 0; + DBUG_ASSERT(func_item->argument_count() == 2); + + /* 'rownum op const' or 'const op field' */ + arg1= func_item->arguments()[0]->real_item(); + if (arg1->type() == Item::FUNC_ITEM && + ((Item_func*) arg1)->functype() == Item_func::ROWNUM_FUNC) + { + arg2= func_item->arguments()[1]->real_item(); + if ((*probe)(arg2)) + { + *limit= arg2->val_int(); + return *limit <= 0 || (ulonglong) *limit >= HA_POS_ERROR; + } + } + else if ((*probe)(arg1)) + { + arg2= func_item->arguments()[1]->real_item();
why not to assign arg2 outside of the if() ?
Because we should not evaluate it if probe is not true. It can be expensive to evalute it or it can have side effects that we may want to avoid. For example, it could be a sub query and probe is there to ensure we don't evaluate it here.
Sure, but func_item->arguments()[1]->real_item() is not evaluating anything. I only mean arg2= line, that's identical in both branches. I didn't mean doing val_int() unconditionally.
+ if (arg2->type() == Item::FUNC_ITEM && + ((Item_func*) arg2)->functype() == Item_func::ROWNUM_FUNC) + { + *limit= arg1->val_int(); + *inv_order= 1; + return *limit <= 0 || (ulonglong) *limit >= HA_POS_ERROR; + } + } + return 1; +} + + +/* + Limit optimization for ROWNUM() + + Go through the WHERE clause and find out if there are any of the following + constructs on the top level: + rownum() <= integer_constant + rownum() < integer_constant + rownum() = 1 + + If yes, then threat the select as if 'LIMIT integer_constant' would + have been used +*/ + +static void optimize_rownum(THD *thd, SELECT_LEX_UNIT *unit, Item *cond) +{ + DBUG_ENTER("optimize_rownum"); + + if (cond->type() == Item::COND_ITEM) + { + if (((Item_cond*) cond)->functype() == Item_func::COND_AND_FUNC) + { + List_iterator<Item> li(*((Item_cond*) cond)->argument_list()); + Item *item; + while ((item= li++)) + optimize_rownum(thd, unit, item); + } + DBUG_VOID_RETURN; + } + + process_direct_rownum_comparison(thd, unit, cond, &is_const, + &temporary_set_limit);
please, explain why it's is_const/temporary_set_limit here and is_basic_const/permanent_limit in optimize_upper_rownum_func().
Please ask Sanja. I think the short version is that const/temporary is for the current select why is_basic_const/permanent is for pushing limit from an outer select to an inner select and there is more restrictions for that.
Sanja said that it's your code and if it were up to him he'd do both permanent. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org