Re: [Maria-developers] [Commits] a74e3ef17e7: MDEV-14551 Can't find record in table on multi-table update with ORDER BY
Hi Serg, Please find some cosmetic input below. (additionally, we've found a failing example while chatting) On Tue, Apr 10, 2018 at 02:10:17PM +0200, serg@mariadb.org wrote:
revision-id: a74e3ef17e7a8693d68001290a8d91b5b206804d (mariadb-10.3.5-138-ga74e3ef17e7) parent(s): 9bd3af97dffa411657879bbdfd4dfef38c99f7cc author: Sergei Golubchik committer: Sergei Golubchik timestamp: 2018-04-10 14:09:17 +0200 message:
MDEV-14551 Can't find record in table on multi-table update with ORDER BY
...
diff --git a/mysql-test/main/multi_update.test b/mysql-test/main/multi_update.test index 5feebe87a5a..96e78dab82b 100644 --- a/mysql-test/main/multi_update.test +++ b/mysql-test/main/multi_update.test @@ -914,3 +914,14 @@ update t1 set c1=NULL; update t1, t2 set t1.c1=t2.c3 where t1.c3=t2.c3 order by t1.c3 desc limit 2; select * from t1; drop table t1, t2; + +# +# MDEV-14551 Can't find record in table on multi-table update with ORDER BY +# +CREATE TABLE t1 (i INT) ENGINE=MEMORY; +INSERT t1 VALUES (1),(2); +CREATE TABLE t2 (f INT) ENGINE=MyISAM; +INSERT t2 VALUES (1),(2); +UPDATE t1, t2 SET f = 126 ORDER BY f LIMIT 2; +SELECT * FROM t2; +DROP TABLE t1, t2; diff --git a/sql/item.h b/sql/item.h index 9574bdc63bf..e391e7810c4 100644 --- a/sql/item.h +++ b/sql/item.h @@ -2023,6 +2023,7 @@ class Item: public Value_source, { marker &= ~EXTRACTION_MASK; } + virtual TABLE *rowid_table() const { return 0; }
I am concerned about this approach bloating class Item's vtable. There is (and likely will ever be) only one class with a different implementation, the check for this is made only in one place - why not check Item's type() or functype(), etc?
};
MEM_ROOT *get_thd_memroot(THD *thd); diff --git a/sql/item_strfunc.cc b/sql/item_strfunc.cc index c6b0ae7fba8..9097ffca867 100644 --- a/sql/item_strfunc.cc +++ b/sql/item_strfunc.cc @@ -5204,3 +5204,23 @@ String *Item_func_dyncol_list::val_str(String *str) my_free(names); return NULL; } + +Item_temptable_rowid::Item_temptable_rowid(TABLE *table_arg) + : Item_str_func(table_arg->in_use), table(table_arg) +{ + max_length= table->file->ref_length; +} + +void Item_temptable_rowid::fix_length_and_dec() +{ + used_tables_cache= table->map; + const_item_cache= false; +} + +String *Item_temptable_rowid::val_str(String *str) +{ + if (!((null_value= table->null_row))) + table->file->position(table->record[0]); + str_value.set((char*)(table->file->ref), max_length, &my_charset_bin); + return &str_value; +} diff --git a/sql/item_strfunc.h b/sql/item_strfunc.h index 18cda491efd..49de5568696 100644 --- a/sql/item_strfunc.h +++ b/sql/item_strfunc.h @@ -1748,5 +1748,20 @@ class Item_func_dyncol_list: public Item_str_func { return get_item_copy<Item_func_dyncol_list>(thd, this); } };
-#endif /* ITEM_STRFUNC_INCLUDED */
Please add a note that this is not the "_rowid" that we support in the parser.
+class Item_temptable_rowid :public Item_str_func +{ + TABLE *table; +public: + Item_temptable_rowid(TABLE *table_arg); + const Type_handler *type_handler() const { return &type_handler_string; } + Field *create_tmp_field(bool group, TABLE *table) + { return create_table_field_from_handler(table); } + String *val_str(String *str); + const char *func_name() const { return "<rowid>"; } + void fix_length_and_dec(); + Item *get_copy(THD *thd) + { return get_item_copy<Item_temptable_rowid>(thd, this); } + TABLE *rowid_table() const { return table; } +};
+#endif /* ITEM_STRFUNC_INCLUDED */ ...
diff --git a/sql/sql_select.cc b/sql/sql_select.cc index 796ea569e64..22301319680 100644 --- a/sql/sql_select.cc +++ b/sql/sql_select.cc @@ -2650,6 +2650,21 @@ bool JOIN::add_having_as_table_cond(JOIN_TAB *tab) }
+bool JOIN::add_fields_for_current_rowid(JOIN_TAB *cur, List<Item> *table_fields)
I think ths is actually "current rowids", in plural. As the loop shows, there are multiple current rowids (from different tables).
+{ + for (JOIN_TAB *tab=join_tab; tab < cur; tab++)
Please add a comment: this will not walk into semi-join materialization nests but this is ok because we will never need to save current rowids for those.
+ { + if (!tab->keep_current_rowid) + continue; + Item *item= new (thd->mem_root) Item_temptable_rowid(tab->table); + item->fix_fields(thd, 0); + table_fields->push_back(item, thd->mem_root); + cur->tmp_table_param->func_count++; + } + return 0; +} + + /** Set info for aggregation tables
diff --git a/sql/sql_select.h b/sql/sql_select.h index f8911fbba01..1da87bb9d50 100644 --- a/sql/sql_select.h +++ b/sql/sql_select.h @@ -1438,6 +1438,9 @@ class JOIN :public Sql_alloc
enum { QEP_NOT_PRESENT_YET, QEP_AVAILABLE, QEP_DELETED} have_query_plan;
+ // if keep_current_rowid=true, whether they should be saved in temporary table + bool tmp_table_keep_current_rowid; Same input as with add_fields_for_current_rowid - use plural.
+ /* Additional WHERE and HAVING predicates to be considered for IN=>EXISTS subquery transformation of a JOIN object.
diff --git a/sql/sql_union.cc b/sql/sql_union.cc index 0149c2848c2..0e29f817e56 100644 --- a/sql/sql_union.cc +++ b/sql/sql_union.cc @@ -507,14 +507,14 @@ void select_union_recursive::cleanup() bool select_union_direct::change_result(select_result *new_result) { result= new_result; - return (result->prepare(unit->types, unit) || result->prepare2()); + return (result->prepare(unit->types, unit) || result->prepare2(0)); Do we really continue to use '0' when 'NULL' should be used? I prefer "NULL".
}
bool select_union_direct::postponed_prepare(List<Item> &types) { if (result != NULL) - return (result->prepare(types, unit) || result->prepare2()); + return (result->prepare(types, unit) || result->prepare2(0)); else return false; } diff --git a/sql/sql_update.cc b/sql/sql_update.cc index 38638d3aa1d..4ae7e33b4e3 100644 --- a/sql/sql_update.cc +++ b/sql/sql_update.cc @@ -2205,10 +2195,41 @@ multi_update::initialize_tables(JOIN *join) DBUG_RETURN(1); tmp_tables[cnt]->file->extra(HA_EXTRA_WRITE_CACHE); } + join->tmp_table_keep_current_rowid= TRUE; DBUG_RETURN(0); }
(Follwing explanations on Slack) Please add a comment there, saying that Here we walk through the aggegation temptable to find where rowids are stored. And then for each of the multi-update tables, we get them to copy the data from the rowid field in the aggregation temptable.
+ +int multi_update::prepare2(JOIN *join) +{ + if (!join->need_tmp || !join->tmp_table_keep_current_rowid) + return 0; + + // there cannot be many tmp tables in multi-update + JOIN_TAB *tmptab= join->join_tab + join->exec_join_tab_cnt(); + + for (Item **it= tmptab->tmp_table_param->items_to_copy; *it ; it++) + { + TABLE *tbl= (*it)->rowid_table(); + if (!tbl) + continue; + for (uint i= 0; i < table_count; i++) + { + for (Item **it2= tmp_table_param[i].items_to_copy; *it2; it2++) + { + if ((*it2)->rowid_table() != tbl) + continue; + *it2= new (thd->mem_root) Item_field(thd, (*it)->get_tmp_table_field()); + if (!*it2) + return 1; + } + } + } + return 0; +} + + multi_update::~multi_update() { TABLE_LIST *table;
BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Hi, Sergey! On Apr 12, Sergey Petrunia wrote:
diff --git a/sql/item_strfunc.h b/sql/item_strfunc.h index 18cda491efd..49de5568696 100644 --- a/sql/item_strfunc.h +++ b/sql/item_strfunc.h @@ -1748,5 +1748,20 @@ class Item_func_dyncol_list: public Item_str_func { return get_item_copy<Item_func_dyncol_list>(thd, this); } };
-#endif /* ITEM_STRFUNC_INCLUDED */
Please add a note that this is not the "_rowid" that we support in the parser.
I used the terminology that you introduced, that is "keep_current_rowid". But if you don't mind, I'd rather rename keep_current_rowid (and every "rowid" that I introduced) to keep_current_position (because it's handler::position()) or keep_current_ref (because it's handler::ref and handler::ref_length), whichever is less ambiguous. Regards, Sergei Chief Architect MariaDB and security@mariadb.org
On Thu, Apr 12, 2018 at 07:22:56PM +0200, Sergei Golubchik wrote:
Hi, Sergey!
On Apr 12, Sergey Petrunia wrote:
diff --git a/sql/item_strfunc.h b/sql/item_strfunc.h index 18cda491efd..49de5568696 100644 --- a/sql/item_strfunc.h +++ b/sql/item_strfunc.h @@ -1748,5 +1748,20 @@ class Item_func_dyncol_list: public Item_str_func { return get_item_copy<Item_func_dyncol_list>(thd, this); } };
-#endif /* ITEM_STRFUNC_INCLUDED */
Please add a note that this is not the "_rowid" that we support in the parser.
I used the terminology that you introduced, that is "keep_current_rowid".
But if you don't mind, I'd rather rename keep_current_rowid (and every "rowid" that I introduced) to keep_current_position (because it's handler::position()) or keep_current_ref (because it's handler::ref and handler::ref_length), whichever is less ambiguous.
I didn't want to request any renames actually. We use the term "rowid" in many places in the code. Until now, there has been no intersection with the SQL-level "_rowid". Now, with the new Item-derived class I was afraid there would be some confusion and wanted a comment to prevent it, that's all. BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Hi, Sergey! On Apr 13, Sergey Petrunia wrote:
Please add a note that this is not the "_rowid" that we support in the parser.
I used the terminology that you introduced, that is "keep_current_rowid".
But if you don't mind, I'd rather rename keep_current_rowid (and every "rowid" that I introduced) to keep_current_position (because it's handler::position()) or keep_current_ref (because it's handler::ref and handler::ref_length), whichever is less ambiguous.
I didn't want to request any renames actually. We use the term "rowid" in many places in the code. Until now, there has been no intersection with the SQL-level "_rowid". Now, with the new Item-derived class I was afraid there would be some confusion and wanted a comment to prevent it, that's all.
Okay, you didn't want to request it. But would you mind if I introduce a consistent terminology and rename rowid to position where applicable? "ref" is ambiguous, rowid is apparently ambiguous too. position was there forever, since 2000 at least, and it's fairly unambiguous, so I'd rather use it consistently everywhere. Regards, Sergei Chief Architect MariaDB and security@mariadb.org
Hi, Sergey! Forgot to mention it in my first reply, sorry On Apr 12, Sergey Petrunia wrote:
diff --git a/sql/item.h b/sql/item.h index 9574bdc63bf..e391e7810c4 100644 --- a/sql/item.h +++ b/sql/item.h @@ -2023,6 +2023,7 @@ class Item: public Value_source, { marker &= ~EXTRACTION_MASK; } + virtual TABLE *rowid_table() const { return 0; }
I am concerned about this approach bloating class Item's vtable.
There is (and likely will ever be) only one class with a different implementation, the check for this is made only in one place - why not check Item's type() or functype(), etc?
I know, I don't like this either. But the alternative was if (item->type() == FUNC_ITEM && ((Item_func*)item)->functype() == ROWID_FUNC && ((Item_temptable_rowid*)item)->table == tbl) ... that is, I need to compare both item->type() and item_func->functype(). So I thought it'd be better to add a method. I can revert back to type()/functype() comparison, if you prefer that. Regards, Sergei Chief Architect MariaDB and security@mariadb.org
On Thu, Apr 12, 2018 at 07:45:37PM +0200, Sergei Golubchik wrote:
Hi, Sergey!
Forgot to mention it in my first reply, sorry
On Apr 12, Sergey Petrunia wrote:
diff --git a/sql/item.h b/sql/item.h index 9574bdc63bf..e391e7810c4 100644 --- a/sql/item.h +++ b/sql/item.h @@ -2023,6 +2023,7 @@ class Item: public Value_source, { marker &= ~EXTRACTION_MASK; } + virtual TABLE *rowid_table() const { return 0; }
I am concerned about this approach bloating class Item's vtable.
There is (and likely will ever be) only one class with a different implementation, the check for this is made only in one place - why not check Item's type() or functype(), etc?
I know, I don't like this either. But the alternative was
if (item->type() == FUNC_ITEM && ((Item_func*)item)->functype() == ROWID_FUNC && ((Item_temptable_rowid*)item)->table == tbl) ...
that is, I need to compare both item->type() and item_func->functype(). So I thought it'd be better to add a method.
I can revert back to type()/functype() comparison, if you prefer that.
I would prefer that, if several checks are needed a short static/inline function will be more readable... BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
participants (2)
-
Sergei Golubchik
-
Sergey Petrunia