Hi, Michael! On May 04, Michael Widenius wrote:
On Mon, Apr 19, 2021 at 10:52 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Apr 18, Michael Widenius wrote:
diff --git a/mysql-test/main/rownum.test b/mysql-test/main/rownum.test new file mode 100644 index 00000000000..8d89eed2f86 --- /dev/null +++ b/mysql-test/main/rownum.test @@ -0,0 +1,542 @@ +--source include/have_sequence.inc +# +# Test of basic rownum() functionallity +# Test are done with Aria to ensure that row order is stable
what does that mean?
It means that SELECT * will return the rows in the same order for the test. (There is normally no guaranteed that insert and select order are the same).
But both MyISAM and InnoDB guarantee that too. What's so special with Aria?
MyISAM does guarantee that, but not InnoDB (returns rows in primary key order) and Aria with page storage will not do it either.
Aria with page row format doesn't. But InnoDB does, primary key order is stable - it's not in the order of insertion, like with MyISAM, but it is stable, always the same.
+ OPEN c_cursor; + FETCH c_cursor INTO v_a, v_b, v_rn; + WHILE c_cursor%FOUND LOOP + SELECT concat(v_a,'--',v_b,'--',v_rn); + FETCH c_cursor INTO v_a, v_b, v_rn; + END LOOP; + CLOSE c_cursor; +END;| +DELIMITER ;| + +--error ER_FUNCTION_CANNOT_BE_USED_IN_CLAUSE +select a, rownum from t1 group by a, rownum having rownum < 3; +select a, rownum from t1 group by a, rownum having "rownum" < 3;
Why *in Oracle mode* the first is an error and the second is not? I'd expect them to be identical.
In Oracle you cannot use rownum function in having. You can however, thanks to MariaDB, refer to a column in the select part.
+select a, rownum as r from t1 group by a, rownum having r < 3;
And the this one. Why is it not an error either?
Because 'r' is a reference, not a function. 'r' works on the result set. rownum cannot be executed in the having clause (which is executed after all rows are returned).
I don't understand it. HAVING is not executed *after* all rows are returned, it's executed *while* they're returned. What is the difference between rownum directly in HAVING and referring to it by an alias?
having rownum < 0 is NOT using an alias.
You can not use a row function on a group result. The group does NOT have a rownum. Each group consist of a set of rows each having it's own rownum. Which rownum should the rownum usage in having get?
To put things simple, we are doing exactly what Oracle is doing, which is the purpose of this patch.
This made so little sense to me even after your explanations, that I've tried it on Oracle. The result was *exactly* the opposite. select a, rownum from t1 group by a, rownum having rownum < 3; this worked just fine and produced the correct result. The other two didn't work, because, so it seems, Oracle doesn't allow to use aliases from the SELECT list in HAVING, the error was select a, rownum as r from t1 group by a,rownum having r < 3; --> ORA-00904: "R": invalid identifier But MariaDB allows it, so in MariaDB all three selects must work.
+ bool check_handler_func_processor(void *arg) override + { + return mark_unsupported_function(func_name(), "()", arg, + VCOL_IMPOSSIBLE);
first, as I wrote above, I think that there's no logical reason why rownum should not work with handler.
Oracle does not support HANDLER, so there is no point in even trying. Also with handler, you have full control of the rows that you are getting, so there is no extra benefit of having rownum.
I know. But you've added a special check, a virtual function, a recursive item tree traversal - all to disable a rownum usage that is not very useful and doesn't exist in Oracle, but still perfectly and consistently fits into MariaDB SQL logic. Did you verify that rownum doesn't "just work" in HANDLER? Even if it doesn't, it'd be easier and more consistent to convert it to LIMIT.
but even if it doesn't, it's much better to check for SQLCOM_HA_READ from Item_func_rownum::fix_fields instead of traversing the whole item tree for every single HANDLER READ command looking for Item_func_rownum that nobody (literally, not a single user out of all our millions) will use there.
Generally, I don't think items should have a list of command where they cannot be used. Normally it is the caller that should do the checking.
Yes. But normally and generally all items work in HANDLER, there's no any logical reason why an item wouldn't. You created rownum as a very weird item that breaks all the common sense rules (because it does in Oracle, I know). I sincerely hope this will never become "normal" and "general", so a check inside Item_rownum is very appropriate here. An even better approach would be to remove the check and just let it work in HANDLER, see above.
+ } + Item *get_copy(THD *thd) override + { + return this;
It looks wrong. We don't have any other Item that returns `this` here. I think rownum cannot be pushed down because of RAND_TABLE_BIT, So better return NULL here, meaning "cannot be pushed down". Instead of `this` meaning "if you push it down, it'll surreptitiously start corrupting the memory"
I don't think it can trash memory even if it would be pushed down. As long as the item exists, it should be safe to use anywhere. I also don't trust the code to be able to cope with returning null in get_copy.
Quite the contrary. There are other items that return NULL from get_copy(). It's expected and normal, the code is designed to support it, it was part of the original implementation. No get_copy() returns this, the code is not designed to handle it.
Actually, originally we had it in many places and it did work and thus we supported it. There is no reason to assume that this would not work in this case. The reason this works is that Item_rownum does not change it null state between copies and does not store it's value in itself. That is the critera for an Item that can return itself.
Monty, I know that get_copy was created to do one of the two things: return *a copy of the item* or *return NULL*. This is how it was originally implemented, by design, and this is how it always worked. Frankly, I don't understand what you're arguing about. get_copy is very well defined - it returns a copy or NULL. You break it and say that "there is no reason to assume that it would not work" in this specific case of Item_rownum? What does it even mean? Item_rownum cannot be copied anyway, it has RAND_TABLE_BIT, so get_copy won't be even called. The only effect of this is that some poor soul can copy your get_copy() and break the server.
--- a/sql/item_func.cc +++ b/sql/item_func.cc @@ -7206,3 +7206,72 @@ void Item_func_setval::print(String *str, enum_query_type query_type) str->append_ulonglong(round); str->append(')'); } + + +/* + Return how many row combinations has accepted so far + 1 + + The + 1 is to ensure that 'WHERE ROWNUM <=1' returns one row
this is a very weird explanation, it sounds as if it's a heuristical a hack for ROWNUM <=1 to work
Why not to write "+1 takes into account the currently computed row"
Because the row is not yet computed or accepted. We can call rownum any time, after a row is accpted, before etc.
I personally prefer the current comment as it is more well defined.
So, you mean, +1 is not needed for WHERE ROWNUM <=2 to return two rows? Or for WHERE ROWNUM<=100 to return 100 rows?
You pick one very specific case as an explanation, this implicitly assumes it's the main reason. Which it is not, even if we'll only consider exactly the same condition template, there are still 18446744073709551614 other reasons :)
Sorry, I neither understand your reasoning or logic here. The comment is correct and explains what is going on in a clear manner. The <= 1 is just an example. You can replace 1 with any number and 'one' with same number verbatim and the comment still works.
if it's an example, the comment should say that. "to ensure that, for example, ..." or "to ensure that queries like ..." or something to explain that `WHERE ROWNUM <=1` is not your single only special case that needs +1. If you'd have "for example" there, I wouldn't have been confused and wouldn't have to write about it.
+1 is, exactly, to take into account the current row, the one being processed right now.
By the way, instead of +1 here, you could simply initialize your accepted_rows with 1, not with 0.
Which would be even harder to explain. Why would we have one accepted row before we even run the query?
If you start with 1, then accepted_rows will mean "the number of the row that will be returned next". Currently it means "number of rows returned so far". Both definitions are consistent and easy to explain.
diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc index 806049b4465..5c4dff33914 100644 --- a/sql/sql_insert.cc +++ b/sql/sql_insert.cc @@ -746,6 +746,9 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list, DBUG_RETURN(TRUE); } + + if (thd->lex->with_rownum && table_list->lock_type == TL_WRITE_DELAYED) + table_list->lock_type= TL_WRITE;
why? do you have a test case when it's needed?
Because we cannot calculate row number inside insert delayed... (We do not know if an row is accepted or not).
"Not accepted"? The only possibility for a row to be not accepted is INSERT IGNORE and a unique key violation, right? WHERE clause is done before the delayed thread.
I don't see any tests for INSERT IGNORE in your patch, so I cannot say what the intended behavior should be. Could you please add a test for it? For example,
create table t1 (a int not null primary key, b int); insert ignore t1 values (1,rownum()),(2,rownum()),(2,rownum()),(3,rownum());
Add edit, and it worked perfectly.
Did you mean "Added"? No, it didn't work perfectly, it worked inconsistently. See, values (1,rownum),(2,rownum),(2,rownum),(3,rownum) is a table value constructor. You use it as a source to insert rows into a table, ignoring duplicate errors. One can argue, whether a rownum result should be 1,2,3 or 1,2,4, but here's a similar example: create table t1 (a int); insert t1 values (1),(2),(2),(3); create table t2 (x int primary key, y int); insert ignore t2 select a,rownum from t1; select * from t2; +---+------+ | x | y | +---+------+ | 1 | 1 | | 2 | 2 | | 3 | 4 | +---+------+ here it clearly shows that rownum applies to the rows of the source result set that will be inserted, not to what was actually inserted. That is insert ignore t2 values (1,rownum),(2,rownum),(2,rownum),(3,rownum); should also result in +---+------+ | x | y | +---+------+ | 1 | 1 | | 2 | 2 | | 3 | 4 | +---+------+ And from that it follows that INSERT DELAYED can perfectly work with ROWNUM, there can be no problem here. And, by the way, rownum doesn't work in table value constructors now: MariaDB [test]> values (1,rownum),(2,rownum); +---+--------+ | 1 | rownum | +---+--------+ | 1 | 0 | | 2 | 0 | +---+--------+ 2 rows in set (0.001 sec)
--- a/sql/sql_limit.h +++ b/sql/sql_limit.h @@ -67,6 +67,15 @@ class Select_limit_counters { return select_limit_cnt; } ha_rows get_offset_limit() { return offset_limit_cnt; } + + void set_limit_if_lower(ha_rows limit) + { + if (!offset_limit_cnt)
I suppose, alternatively, you could've had
if (offset_limit_cnt + select_limit_cnt > limit) select_limit_cnt= limit - offset_limit_cnt;
+ { + if (select_limit_cnt > limit) + select_limit_cnt= limit; + } + } };
In some cases LIMIT is absolute. With rownum it only applies if limit is higher. I think it is good to separate those tests.
What do you mean "LIMIT is absolute"? I only suggested that LIMIT can take ROWNUM into account also when OFFSET is non-zero. In your case you only adjust LIMIT if there is no OFFSET.
In practice, very few queries has both rownum and limit, so I don't think it is worth the effort to spend too much time on this.
Yes, I certainly agree with that.
diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy index 2bd2575b882..57ba9df42c0 100644 --- a/sql/sql_yacc.yy +++ b/sql/sql_yacc.yy @@ -10286,6 +10287,22 @@ function_call_nonkeyword: if (unlikely($$ == NULL)) MYSQL_YYABORT; } +/* Start SQL_MODE_ORACLE_SPECIFIC + | ROWNUM_SYM optional_braces + { + $$= new (thd->mem_root) Item_func_rownum(thd); + if (unlikely($$ == NULL)) + MYSQL_YYABORT; + } +End SQL_MODE_ORACLE_SPECIFIC */ +/* Start SQL_MODE_DEFAULT_SPECIFIC */ + | ROWNUM_SYM '(' ')' + { + $$= new (thd->mem_root) Item_func_rownum(thd); + if (unlikely($$ == NULL)) + MYSQL_YYABORT; + } +/* End SQL_MODE_DEFAULT_SPECIFIC */
You have two nearly identical rules compiled conditionally. I can pretty much guarantee that they'll go out of sync eventually.
Better do something like
| ROWNUM_SYM /* Start SQL_MODE_ORACLE_SPECIFIC optional_braces End SQL_MODE_ORACLE_SPECIFIC */ /* Start SQL_MODE_DEFAULT_SPECIFIC */ '(' ')' /* End SQL_MODE_DEFAULT_SPECIFIC */ { $$= new (thd->mem_root) Item_func_rownum(thd); if (unlikely($$ == NULL)) MYSQL_YYABORT; }
A little better, but not much. Still not readable.
This whole work of removing sql_yacc_ora.yy and using comments for parts of the grammar that differ - this was done *exactly* to avoid two nearly identical grammars that always needs to be modified in sync. Which is very error-prone.
By literally copying a part of the grammar you're reintroducing the old problem of two copies. Not completely, of course, by bit by bit we'll have two copies to maintain again.
Please, either use the "little better" grammar that I suggested or rewrite it any way you want that doesn't involve two identical copies of the same rule.
There is now way they can go out of sync as we have test cases to detect this. (I don't find the new suggested code readable).
It doesn't matter, you've added your perl script specifically to make this code readable. So no matter what the original yy file will be, your perl-processed files will always be as readable as you wanted them. Anyway, I don't want to waste time on arguing that code duplication should be avoided. If you leave two identical code blocks, someone will someday merge them to reduce code duplication.
diff --git a/sql/sql_select.cc b/sql/sql_select.cc @@ -14210,7 +14236,24 @@ static ORDER * remove_const(JOIN *join,ORDER *first_order, COND *cond, bool change_list, bool *simple_order) { - *simple_order= join->rollup.state == ROLLUP::STATE_NONE; + /* + We can't do ORDER BY using filesort if the select list contains a non + deterministic value like RAND() or ROWNUM(). + For example: + SELECT RAND() as 'r' from t1 order by r;
This one uses temporary for me. Can you put here a correct example please?
It is a correct example and probably the best possible example. I do not understand your comment (does not make any sense). Could you please provide a correct comment?
I tried this in the 10.5 branch:
MariaDB [test]> explain select rand() as r from mysql.user order by r\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: user type: index possible_keys: NULL key: PRIMARY key_len: 228 ref: NULL rows: 3 Extra: Using index; Using temporary; Using filesort 1 row in set (0.002 sec)
as you see, it uses temporary, so there is no need to do your
+ *simple_order= !join->rand_table_in_field_list;
because there is no bug here. At least for the example that you've used. This is why I asked for an example of a query that indeed behaves incorrectly without this fix of yours.
The purpose of this function is to set simple_order. If we do not have my line, it will be set always false, which is what you are seeing. However always having 'use temporary' for any filesort is of course not a good idea.
Here is an example that shows this if I remove my line.
MariaDB [test]> explain select a from t1 order by a; +------+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 4 | Using temporary; Using filesort | +------+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
There is no reason to do a temporary table for this query
how is it relevant? It doesn't use RAND() and you added a line with join->rand_table_in_field_list. Could you *please* provide an example of a query where your line makes a difference?
+static bool is_const(Item *item) +{ + /* + Constant expression in WHERE can be also subquery or other expensive + thing we shoud not evaluate on OPTIMIZATION phase + */ + return item->const_item() && !item->is_expensive(); +} + + +// Check if this is an unexpensive functional expression over basic constants + +static bool is_basic_const(Item *item) +{ + if (item->basic_const_item()) + return true; + if (!item->const_item() || item->is_expensive()) + return false; + if (item->type() == Item::FUNC_ITEM) + { + Item_func *func= ((Item_func *)item); + if (func->argument_count() == 0) + return false; + bool result= true; + Item **args= func->arguments(); + for (uint i= 0; i < func->argument_count() && result; i++) + result= result && is_basic_const(args[i]); + return result; + } + return false; +}
is_basic_const() is a misleading name, a function of basic consts isn't a basic const itself.
This is how bar has named the item and classes. Do a grep after basic_const..
I know, that was not my question. See below
what's an example of an expression has is_const() == true, but is_basic_const() == false?
"1+1" This is the sum of two basic constants.
Yes, this is an example of the item, where item->const_item() == true and item->basic_const() is false. This is not what I asked.
I asked for an example, where is_const(item) is true and is_basic_const(item) is false. The question was not about old items, but about your new code. Because your is_basic_const() is not just item->basic_const(), so I asked for an example, to understand a difference.
You have to ask Sanja (this is his code). Anyway, in some part the answer to your quesiton is the same as asking when Item::basic_const_item() is true and Item::is_const() is true, as the functions are using this as a base for their return value.
I understand that when Item::basic_const_item() is true, then is_basic_const() will be true. I want to understand cases when Item::basic_const_item() is not true, and is_basic_const() is not true but is_const() is true. See below.
The function comments explains in quite a detal what is tested.
The main point with is_basic_const() is to detect functions where all arguments are basic_consts()
like rownum < 1+2
This is true for is_basic_const.
Yes, of course. I saw that. This is a case when Item::basic_const_item() is false, but is_basic_const() is true. is_basic_const(item) is true, if item is an expression, with all arguments being Item::basic_const_item(). I'm asking, when an item is const_item(), but is not an expression with all arguments being Item::basic_const_item()?
2. why are you not removing rownum from the outer select?
Because it may not be safe to do in some cases. Better safe than sorry..
if you remove rownum and reset with_rownum=false, it'll generally give optimizer more freedom.
Has no effect on our optimizer in this particular case.
I'm not sure about it. ROWNUM cannot be pushed down into a subquery. LIMIT can be, as you said, so if you convert a ROWNUM into a LIMIT and remove ROWNUM from WHERE, then the whole WHERE can be pushed into a subquery, and a LIMIT can be too. So
SELECT * FROM (SELECT col1, col2 FROM tables WHERE cond ORDER BY something) WHERE ROWNUM <=100 AND col1 = 10;
can first be changed to
SELECT * FROM (SELECT col1, col2 FROM tables WHERE cond ORDER BY something) WHERE col1 = 10 LIMIT 100;
SELECT * FROM (SELECT col1, col2 FROM tables WHERE cond AND col1 = 10 ORDER BY something LIMIT 100);
and for that you don't even need any special code for pushing ROWNUM into a subquery as a limit, you only need to convert ROWNUM to a LIMIT on the same level of select.
That would not work if there would be an order by on the upper level or there would be a join involved.
You cannot convert rownum to a limit on the same level. The only time you can convert rownum to a limit is when you push it down to single inner select, which is what the current code is doing.
of course. for any optimization I can say "but it cannot be used anymore if I add this or that". It's not a reason not to support it. For example, we have an optimization to convert outer join to an inner join, even though not every outer join can be converted to an inner join.
I understand what you are trying to suggest. I did take a look at our current code and almost all of it would still be needed to handle the conversion of some rownum's to LIMIT and to handle the general case.
I thought the above could be done very easily - just convert ROWNUM to LIMIT on the same level, when possible. It's a small change in one place. And then just let LIMIT optimisations do their job.
+static void optimize_rownum(THD *thd, SELECT_LEX_UNIT *unit, Item *cond) +{ + DBUG_ENTER("optimize_rownum"); + + if (cond->type() == Item::COND_ITEM) + { + if (((Item_cond*) cond)->functype() == Item_func::COND_AND_FUNC) + { + List_iterator<Item> li(*((Item_cond*) cond)->argument_list()); + Item *item; + while ((item= li++)) + optimize_rownum(thd, unit, item); + } + DBUG_VOID_RETURN; + } + + process_direct_rownum_comparison(thd, unit, cond, &is_const, + &temporary_set_limit);
please, explain why it's is_const/temporary_set_limit here and is_basic_const/permanent_limit in optimize_upper_rownum_func().
Please ask Sanja. I think the short version is that const/temporary is for the current select why is_basic_const/permanent is for pushing limit from an outer select to an inner select and there is more restrictions for that.
Sanja said that it's your code and if it were up to him he'd do both permanent.
All code related to pushing down rownum is Sanjas.
In the above function, the loop is mine, but the call to process_direct_rownum_comparison(thd, unit, cond, &is_const, &temporary_set_limit);
Is Sanjas.
The whole idea of temporary set limit was his idea. If I remember correctly, he did only want to have LIMIT pushdown shown in explain extend for one case, but not for another.
okay, then if neither you nor Sanja knows, I'll try to analyze later myself and see what I'll find. Regards, Sergei