Hi Alexander,
-----Message d'origine----- De : Alexander Barkov [mailto:bar@mariadb.org] Envoyé : jeudi 9 novembre 2017 14:52 À : jerome brauge Cc : 'MariaDB Developers (maria-developers@lists.launchpad.net)' Objet : Re: MDEV-13418 Compatibility: The order of evaluation of SELECT..INTO assignments
On 11/09/2017 03:03 PM, jerome brauge wrote:
-----Message d'origine----- De : Alexander Barkov [mailto:bar@mariadb.org] Envoyé : jeudi 9 novembre 2017 11:05 À : jerome brauge Cc : 'MariaDB Developers (maria-developers@lists.launchpad.net)' Objet : Re: MDEV-13418 Compatibility: The order of evaluation of SELECT..INTO assignments
Hello Jerome,
On 11/09/2017 12:03 PM, jerome brauge wrote:
Hello Alexander,
-----Message d'origine----- De : Alexander Barkov [mailto:bar@mariadb.org] Envoyé : mardi 7 novembre 2017 16:34 À : jerome brauge Cc : 'MariaDB Developers (maria-developers@lists.launchpad.net)' Objet : Re: MDEV-13418 Compatibility: The order of evaluation of SELECT..INTO assignments
Hi Jerome,
On 11/06/2017 01:01 PM, jerome brauge wrote:
Hello Alexander, We have checked all our stored procedure and the good news for us is that only 2 statements are concerned ! So, MDEV-13418 is not mandatory for us.
However, can we consider to do a full parse just at compile time ? It would allow to : - determining variables which must have a temporary pair and modifying stored source code to handle them - issuing error or warning when these statements have syntax error or call unknown functions ( like Oracle, SQLserver , UDB , .. )
I'm not sure I understood about full parser at compile time. Can you please give an example?
Currently, the following function is compiled without any warning and run fine as long as b1 is not between 1 and 5.
delimiter / CREATE or replace procedure p1(b1 integer) BEGIN declare res INTEGER default 0; if b1 = 0 then -- unknow variable set res=x; end if; if b1 = 1 then -- unknow function in set set res=lenngth('v'); end if; if b1 = 2 then -- unknow variable select a into res; end if; if b1 = 3 then -- unknow table / column / function select 1 into res from unknowtable where unknowcolumn = unknowfunc(); end if; if b1 = 4 then -- unknow group by column select 1 into res from dual group by 13; end if; if b1 = 5 then -- unknow function in test if coalesc(res,'x') = 'y' then set res:='z'; end if; end if;
END / call p1(-1) /
My idea is to have different behavior during "create procedure" and load_routine. Load_routine must be fast but "create procedure" can be slower and do more check (at least check existence of variables, tables, columns, functions and procedure) It's not truly acceptable to throw this kind of error only when the code is really executed.
At CREATE time, we can only add a warning when an unknown procedure or function is called. Issuing errors is not possible, because one would not be able to create two routines mutually executing each other (now it is possible).
I agree, a warning is fine for an undefined called function. But is there any chance to have control on used variables and columns ?
In this example:
DELIMITER $$ CREATE OR REPLACE PROCEDURE p1() BEGIN DECLARE res INT DEFAULT 0; SET res=x; END; $$ DELIMITER ;
I think we should generate the error at CREATE time, in all sql_mode's.
Looks like a bug that we don't do it.
Fine.
For things like:
SELECT 1 INTO var FROM unknowtable where unknowcolumn = 10;
We cannot return errors by default. This behavior has been in MySQL/MariaDB since the time when stored procedures appeared: tables and columns resolution is done at execute time.
But adding a new sql_mode to return errors in such cases should be possible. The solution should handle cases when CREATE PROCEDURE was done with a known column, but then the column was removed by a ALTER TABLE. So it seems both CREATE and load must do this. I'm not sure how it can affect performance.
I'm not sure that we need handle these cases during load. It's the responsibility of the developer to do an impact analysis when he modify objects (tables or procedure/function). An error at runtime is acceptable (Sybase and SQLserver works like this).
So issuing errors is possible at execution time only.
When a routine is loaded, it could be checked for all called procedures and functions, and a warning or an error could be issued if some routine is missing (independently on conditions in control
structures).
To do this, each select list expression have to be "parse" and so we can
determine variables which must have a temporary pair.
Ex: CREATE PROCEDURE p1(res OUT VARCHAR) AS b1 INT:=10; BEGIN SELECT 1,CAST(b1 AS VARCHAR(10)) INTO b1, res FROM dual; END;
Can be transform and store as CREATE PROCEDURE p1(res OUT VARCHAR) AS b1 INT:=10; BEGIN DECLARE _b1 INTEGER DEFAULT b1; SELECT 1,CAST(b1 AS VARCHAR(10)) INTO b1, res FROM dual; SET b1=_b1; END;
So, no additional works should be done during load_routine.
Now we always put the routine into mysql.proc AS IS, how the definer defined it.
I'm not sure if we should rewrite CREATE statements...
Without this second point we are exposed to keystrokes errors and to find
these mistakes only at runtime.
It's already very hard to have tests that cover most of the code, but here we must have 100% of coverage ! Fortunately for us, SQLServer and Oracle does the work for us :-)
Other idea : maybe the behavior of "set" could be conditioned by another sql_mode. Sybase , SQLServer, Mariadb : multiple "set" are evaluated as multiple single "set" DB2 : work as select (each expression is evaluated before the assignments are performed)
We could have three sql_mode: - SET_CONSISTENCY_UPDATE - SET_CONSISTENCY_SELECT - SET_CONSISTENCY_SET
What would SET_CONSISTENCY_SELECT stand for? Can you give an example?
I thought to: - SET_CONSISTENCY_UPDATE for MDEV-13417 : UPDATE produces
wrong
values
if an updated column is later used as an update source - SET_CONSISTENCY_SELECT for MDEV-13418 : The order of evaluation of SELECT..INTO assignments - SET_CONSISTENCY_SET : evaluate expression before any assignment in "grouped" SET
Thanks!
I thought that behavior of SET is directly related to behavior of SELECT INTO, as both save data into variables. Why have two different sql_mode options for SET and SELECT INTO?
Btw, why not have just one option SET_CONSISTENCY, which will control all three (UPDATE, SELECT INTO, SET) ?
I didn't understand why you propose separate flags. Can you clarify please?
For UPDATE , there is no workaround, we must implement it for Oracle compatibility.
For SELECT INTO : I've always thinking that it's a bad practice to use and assign the same variable in one statement. When reading the code, we can't know what the developer meant to do. It's not hard to correct this kind of statement. Introduce an overhead for most of select statement (we can optimize it when select return only one value), is not acceptable for us and we don't want enable this feature.
For SET, most of RDBMS (that I know) works as Mariadb : - Sybase and SQLServer - and implicitly ORACLE because it can only do single assignment at one time Others like IBM DB2 have same behavior for multi set and select into.
So it's not a required feature for Oracle compatibility.
We cannot waste sql_mode bits for small features. We'll run out of the second 32 bit slot quickly.
Well. Is it complex to add a new longlong to extend sql_mode? (perhaps for replication) Do you think that we could commit a first step of SET_CONSISTENCY just for the UPDATE part ?
In this case, I'm inclined to have a single SET_CONSISTENCY bit for all three purposes (UPDATE, SELECT INTO, SET). We should just make sure that SELECT INTO (and SET) works with a good performance.
Multi-SET is less important than SELECT INTO, as multi-SET is neither standard, nor Oracle-compatible.
Thanks!
Regards, Jérôme.
> -----Message d'origine----- > De : jerome brauge > Envoyé : vendredi 3 novembre 2017 12:18 À : 'Alexander Barkov' > Cc : MariaDB Developers (maria-developers@lists.launchpad.net) > Objet : RE: MDEV-13418 Compatibility: The order of evaluation of > SELECT..INTO assignments > > Hi Alexander, > Thanks for this idea, but for us, cpu is a critical hotspot. > I will evaluate how many statements in our application are affected. > If this number is not to high, we modify them. > In this case, we will need two sql_mode (one for MDEV-13417 and > one for MDEV-13418). > > Best regards. > Jérôme. > > >> -----Message d'origine----- >> De : Alexander Barkov [mailto:bar@mariadb.org] Envoyé : vendredi >> 3 novembre 2017 11:02 À : jerome brauge Cc : MariaDB Developers >> (maria-developers@lists.launchpad.net) >> Objet : Re: MDEV-13418 Compatibility: The order of evaluation of >> SELECT..INTO assignments >> >> Hello Jerome, >> >> >> On 11/03/2017 12:22 PM, jerome brauge wrote: >>> Hello Alexander. >>> I've begun to implement your proposal but now I'm not sure that >>> it's a >> better solution than mine. >>> Let me explain . >>> - first : number of temporary variables can be significant >>> because we don't >> know when they are really needed and their scope are local to >> the statement. >>> >>> declare b1 INTEGER; >>> declare res INTEGER; >>> ... >>> if b1 = 0 then >>> select 1,b1+1 into b1, res from dual; >>> end if; >>> if b1 = 1 then >>> select 2,b1+2 into b1, res from dual; >>> end if; >>> >>> will be transform in : >>> >>> declare b1 INTEGER; >>> declare res INTEGER; >>> ... >>> if b1 = 0 then >>> declare _b1 INTEGER default res; >>> declare _res INTEGER default res; >>> select 1,b1+1 into _b1, _res from dual; >>> set b1=_b1; >>> set res=_res; >>> -- _res is not needed, but we don't know because the select >>> statement is >> not parsed >>> end if; >>> if b1 = 1 then >>> declare _b1 INTEGER default res; >>> declare _res INTEGER default res; >>> select 2,b1+2 into b1, res from dual; >>> set b1=_b1; >>> set res=_res; >>> -- same thing here, and we have declare two variables for >>> each target >> variables >>> end if; >>> >>> Perhaps we could >>> - declare these temporary variables only one time in the first >>> frame of the stored procedure (may be tricky) >>> - parse columns of each select to know what variables are >>> really assigned and reused (heavy cost in cpu and time) >>> >>> - second : if we can't determine variables which must have a >>> temporary >> variable, number of sp_instr_set and sp_instr_set_var will be >> too high and their cpu cost is not acceptable. >>> >>> My first solution has a fixed memory impact (and memory is not >>> an issue >> nowadays), and especially a very light cpu cost. >>> >>> What do you think about ? >> >> I agree that determining variables which must have a temporary >> pair by full scanning the SELECT list expressions is probably >> not a good
idea.
>> >> Declaring these variables only one time is easy. >> I earlier made about the same trick with cursor parameters. >> See sp_pcontext::retrieve_field_definitions(). >> >> The idea is that we can put such backup variables into a >> separate child sp_pcontext frame, to make sure that only one >> backup variable exists if the real variable appears multiple >> time as a SELECT INTO target. Having a dedicated frame allows: >> - to add variables in the middle of a BEGIN..END block, without >> having to re- enumerate local variables of the same block. >> - handle unique names >> >> See the comment in sp_pcontext::retrieve_field_definitions() >> about frame positions and run-time positions, and the CURSOR >> declaration grammar in sql_yacc_ora.yy. >> >> >> Every sp_pcontext frame should have one sp_pcontext frame for backup >> variables, which should be put into m_children. >> >> sp_pcontext should probably have a new flag: >> bool m_is_backup; >> So we can iterate through m_children and find the backup frame. >> >> Another option is to add: >> >> int m_backup_child_context_index; >> >> which will store -1 if the current frame does not have a backup >> child yet, or a non-negative value meaning the index in m_children. >> So a new method could looks like this: >> >> sp_pcontext *get_backup_context() { >> if (m_backup_child_contex_index < 0) >> return 0; >> return m_children(m_backup_child_context_index); >> } >> >> >>> >>> Regard, >>> Jérôme. >>> >>> >>>> -----Message d'origine----- >>>> De : Alexander Barkov [mailto:bar@mariadb.org] Envoyé : lundi
30
>>>> octobre 2017 10:02 À : jerome brauge Objet : Re: MDEV-14139 >>>> Anchored data types for variables >>>> >>>> >>>> >>>> On 10/28/2017 07:29 PM, Alexander Barkov wrote: >>>>> >>>>> On 10/27/2017 10:27 PM, Alexander Barkov wrote: >>>>>> Hello Jerome, >>>>>> >>>>>> I have implemented "MDEV-14139 Anchored data types for > variables". >>>>>> and pushed it to bb-10.2-ext: >>>>>> >>>>>> >>>> >> >
https://github.com/MariaDB/server/commit/5dd5253f7e50c21fa758e2eb58f
>>>> 3 >>>>>> aa9c9754e733 >>>>>> >>>>>> So now it should be easier to implement consistent SET by >>>>>> creating backup variables. >>>>>> >>>>>> >>>>>> LEX::sp_variable_declarations_vartype_finalize() implements >>>>>> the logic which copies data type from another variable. >>>>>> >>>>>> >>>>>> The idea is that for all variables, which are assignment >>>>>> targets in a SET or a SELECT INTO statement, we create a backup variable. >>>>>> >>>>>> It will involve these calls for every such variable: >>>>>> >>>>>> LEX::sp_variable_declarations_init(thd, 1); >>>>>> sp_pcontext::add_variable(thd, backup_variable_name); >>>>>> LEX::sp_variable_declarations_vartype_finalize(thd, 1, >>>>>> orig_variable_name, def); >>>>>> >>>>>> where "def" is Item_splocal created for the original variable. >>>>> >>>>> Just an idea: instead of creating sp_instr_set, it's easier to >>>>> introduce a new class sp_instr_set_var, to copy the value from >>>>> one variable to another variable. >>>>> >>>>> This operation will not need neither Item, nor sp_lex_keeper. >>>>> It will only need two offsets: >>>>> - the source variable offset and >>>>> - the target variable offset. >>>>> >>>>> Using these offsets, we can access to >>>>> spcont->m_var_table->field[source] and >>>>> spcont->spcont->m_var_table->field[target] >>>>> and copy the value between them using Field::store_field(). >>>>> >>>>> This won't however for the ROW variables at the moment, because >>>>> ROW fields are stored in the Item_spvar_args::m_table member of >>>>> Item_field_row. >>>>> >>>>> It seems we need a new class Field_row and move Virtual_tmp_table >>>>> from Item_field_row to Field_row. >>>>> >>>>> I will try to do it. >>>> >>>> >>>> I have implemented "MDEV-14212 Add Field_row for SP ROW variables" >>>> and pushed to bb-10.2-ext. >>>> >>>> Also, added a comment: >>>> >>>> MDEV-13418 Compatibility: The order of evaluation of >>>> SELECT..INTO assignments >>>> >>>> >>>> Now, when MDEV-14212 is done, these declarations: >>>> >>>> DECLARE a_tmp TYPE OF a DEFAULT a; >>>> DECLARE b_tmp TYPE OF b DEFAULT b; >>>> >>>> and these assignments: >>>> >>>> SET a=a_tmp; >>>> SET b=b_tmp; >>>> >>>> can use a new sp_instr_setvar instead of sp_inst_set. >>>> >>>> The new command sp_instr_setvar can do direct copying between two >>>> spcont->m_var_table->field[XXX], without a need to create Item >>>> spcont->and > LEX. >>>> >>>>> >>>>>> >>>>>> Greetings. >>>>>>